NanoTwitter Functionality slides

Details of the functionality (looking from the outside) that nanoTwitter requires

Minimum nT Functionality

General Notes

  • NanoTwitter (nT) is a baby version of Twitter designed as a platform for experimentation with scaling issues.
  • The list of features and URLs is quite incomplete. It is meant to set the pattern for you to build upon.
  • All the URLs, both for the User Interface and for the Web Services API, will follow REST design principles as far as possible.

Users

  • Can register for an account by supplying at least an email and a password (plus confirmation password)
  • Can never be deleted
  • Are assigned a numeric id (the primary key) which will be used in certain APIs

Authentication

  • All pages include login/logout and register links
  • If a user is logged in then the login/logout link says logout; otherwise it says login
  • Authentication will use a simplistic hashed password in the User record
  • We will not be concerned for this exercise with having tight security
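As an illustration of the "simplistic hashed password" idea, here is a minimal sketch using only Ruby's standard library. The module name SimplePassword, its method names, and the "salt:digest" storage format are assumptions made for this example; they are not part of the nT specification.

```ruby
require "digest"
require "securerandom"

# Hypothetical helper module; the name and the "salt:digest" storage
# format are assumptions made for this sketch.
module SimplePassword
  # Builds the string to store in the User record.
  def self.create(password)
    salt = SecureRandom.hex(8)
    "#{salt}:#{Digest::SHA256.hexdigest(salt + password)}"
  end

  # Re-hashes the candidate with the stored salt and compares digests.
  def self.matches?(stored, candidate)
    salt, digest = stored.split(":", 2)
    Digest::SHA256.hexdigest(salt + candidate) == digest
  end
end

stored = SimplePassword.create("password")
puts SimplePassword.matches?(stored, "password") # true
puts SimplePassword.matches?(stored, "guess")    # false
```

This is deliberately simple, in line with the note above that tight security is not a goal of the exercise.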

Non-logged in users:

  • Will not be able to do anything other than register an account or log into an existing account

Logged in users:

  • Can follow and un-follow other registered users
  • Can create Tweets
  • Can see the flow of the last 50 tweets by the users that they have followed
  • Can search for a phrase
  • Can see a list of users who are following them
  • Can see a list of the users that they are following
  • Can see a list of tweets referring to a particular “#tag”
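The "last 50 tweets by the users that they have followed" rule can be sketched in plain Ruby as below. The Tweet struct, the follows hash, and the newest-first ordering are assumptions for illustration; a real nT would most likely express this as a database query.

```ruby
# Hypothetical in-memory record; a real app would use its own model.
Tweet = Struct.new(:id, :user_id, :text, :created_at)

# follows maps a user's id to the ids of the users they follow.
# Returns the most recent tweets by followed users, newest first (assumed order).
def timeline(tweets, follows, user_id, limit = 50)
  followed = follows.fetch(user_id, [])
  tweets
    .select { |t| followed.include?(t.user_id) }
    .sort_by { |t| t.created_at }
    .reverse
    .first(limit)
end

now = Time.now
tweets = (1..60).map { |i| Tweet.new(i, 2, "tweet #{i}", now + i) }
tweets << Tweet.new(61, 3, "from a user who is not followed", now + 100)

feed = timeline(tweets, { 1 => [2] }, 1)
puts feed.length   # 50
puts feed.first.id # 60
```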

Tweets:

  • Consist of
    • up to 280 characters of text
    • a date-time of creation
    • zero or more @mentions of other users
    • zero or more “#tags” of topics
  • Belong to one user
  • Can never be deleted

Search:

  • A page with at least a search box
  • User can enter a phrase or a “@username” or a “#tag” or some combination
  • Result of matching is displayed
  • Top 50 are displayed with next and previous links
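The tweet structure and the search behaviour described above can be sketched as follows. The function names, the regular expressions for @mentions and #tags, and the "every term must appear" matching rule are assumptions for this example; the specification leaves the exact matching algorithm open.

```ruby
MAX_TWEET_LENGTH = 280

# Extracts the @mentions and #tags from a tweet's text.
# Raises ArgumentError when the text is longer than 280 characters.
def parse_tweet(text)
  if text.length > MAX_TWEET_LENGTH
    raise ArgumentError, "tweet exceeds #{MAX_TWEET_LENGTH} characters"
  end
  {
    mentions: text.scan(/@(\w+)/).flatten,
    tags: text.scan(/#(\w+)/).flatten
  }
end

# Returns one page of the texts that contain every term of the query.
# A term can be a plain word, an "@username" or a "#tag" (assumed rule).
def search(texts, query, page: 1, per_page: 50)
  terms = query.downcase.split
  matches = texts.select { |t| terms.all? { |term| t.downcase.include?(term) } }
  matches.each_slice(per_page).to_a[page - 1] || []
end

p parse_tweet("Hello @alice and @bob #ruby")
```

The page argument is what the next and previous links would increment or decrement.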

General User Interface

  • Basic bootstrap html/css/javascript interface at minimum
  • Clean but doesn’t have to be fancy

Routes

  • Home page: /
  • Specific user’s page: /user/1234
  • Search form: /search
  • Display full text search results: /search?phrase=
  • Display the user registration page: /register
  • Display user login prompt, and check for correct password: /login
  • Logout: /logout

Test Interface

The Test Interface essentially is a set of URLs which invoke special functionality that allows your nanoTwitter to be tested from a browser, and is the foundation of the scalability testing we will be doing.

It works by implementing some special routes (that all start with /test/xxxx) that perform operations which a normal user would never do. Because of this, if this were really deployed, you would protect those routes with a firewall to make sure that no one but the developers could access them.

These routes will be used during the load testing to reset the server, create lots of fake users, measure performance and so on. It’s analogous to the Onboard Diagnostics Interface that all cars have. There’s a plug near the steering wheel that you probably never noticed, where the mechanic (or you) can plug in an instrument to inspect the internal condition of the car.

The Test user

There’s a user that we use as part of many of the tests. We refer to this user as “testuser”. When you create that user, use the following attributes:

  • name: testuser
  • email: testuser@sample.com
  • password: “password”

Special url argument

  • To facilitate testing, any URL can have ?user_id=x appended, which will bypass the login and automatically make this nT session logged in as user x. For example:
  • GET /?user_id=x would show the homepage as if user_id x was logged in.
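One way to implement the ?user_id=x bypass is a small helper that every route calls before doing its work. In this sketch, params and session are plain hashes standing in for the request parameters and session store of whatever web framework is used; the helper name current_user_id is an assumption.

```ruby
# Returns the id of the logged-in user, or nil when nobody is logged in.
# A valid user_id in the URL overrides the session and logs that user in.
def current_user_id(params, session)
  # Base 10 is forced so "012" is read as 12; bad input yields nil.
  override = Integer(params["user_id"].to_s, 10, exception: false)
  session["user_id"] = override if override
  session["user_id"]
end

session = {}
p current_user_id({ "user_id" => "42" }, session) # 42
p current_user_id({}, session)                    # 42
```

The second call shows the session staying logged in as user 42 on later requests without the parameter, which is one reading of "make this nT session logged in as user x".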

Test Data

  • Url to reset and load the standard seed is: /test/reset/standard?tweets=n&users=u&follows=u
  • Dataset is the standard Seed Data
  • If necessary, you may modify the creation dates of tweets, but all the other data should be intact
  • Our tests will require the complete set of records

Test Interface

GET /test/reset?user_count=u

  • Deletes all users, tweets and follows
  • Recreates TestUser
  • Imports data from the standard seed data (see Seed Data)
    • ?user_count=u means to import u users from the seed data
    • This includes all the related follows. The user mentioned in a follow also gets imported, which means you end up with more than u users. But just go one level deep.
  • Example: /test/reset?user_count=10 means to import 10 users from the seed data plus their tweets and follows.
  • Returns 200 or 400
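The "one deep" import rule can be sketched as below. The function name and the shape of the inputs (an ordered list of ids and a list of [follower_id, followed_id] pairs) are assumptions about how the seed data might be held in memory. The sketch also assumes that a "related" follow is any follow with a chosen user at either end.

```ruby
require "set"

# seed_user_ids: user ids in seed-file order.
# seed_follows:  [follower_id, followed_id] pairs from the seed data.
# Returns the ids to import: the first `count` users plus anyone who
# shares a follow with them. Follows of those extra users are not chased.
def users_to_import(seed_user_ids, seed_follows, count)
  chosen = seed_user_ids.first(count)
  picked = Set.new(chosen)
  seed_follows.each do |follower, followed|
    picked << followed if chosen.include?(follower)
    picked << follower if chosen.include?(followed)
  end
  picked.to_a
end

p users_to_import([1, 2, 3, 4, 5], [[1, 3], [3, 4], [5, 2]], 2).sort
# [1, 2, 3, 5]
```

User 4 is left out because it is only reachable through user 3, which would be two levels deep.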

GET /test/tweet?user_id=x&tweet_count=y

  • x is the user id of some user
  • y is how many randomly generated tweets are submitted on that user’s behalf
  • Example: GET /test/tweet?user_id=123&tweet_count=22 will generate 22 random tweets for user 123
  • Returns 200 or 400

GET /test/status

  • One page “report”:
    • How many users, follows, and tweets are there
    • What is the TestUser’s id
  • Returns 200 or 400
  • Example: /test/status

GET /test/validate?n=n&star=u1&fan=u2

  • Checks for valid processing of the app
  • The purpose is to detect whether the nT back end is faithfully storing and retrieving the data
  • This cannot be taken for granted, because concurrency bugs that lead to cache mistakes are very easy to introduce
  • star and fan are the user_ids of existing users
  • In one request, does the following process
    1. Make sure that user fan is following user star and if not, make it so
    2. Have user star create n tweets using faker.
    3. Don’t just use Tweet.create; use the code that is actually executed when a user submits a tweet
    4. This will invoke whatever optimizations, follower handling which may include redis, queues, or other optimizations.
    5. Store the tweet ids and the content for later.
    6. You can simply use two arrays, one of the faker text for each tweet and one of the resultant id of the tweet
  • Once complete query the backend for those same ids
    1. Again don’t just use Tweet.find; use the code that is actually executed when displaying a tweet in the ui
    2. Similarly, it will use whatever optimizations, redis, queues and so on.
  • Query the timeline of fan to make sure that the new tweets are listed
    1. Again, use the code that is used by the front end not just the low level code.
  • Note: this is not checking HTML; it bypasses that and uses the lower level internal methods (you can decide which).
  • Returns 200 or 400
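The validation process above can be sketched as a single function. FakeApp is a hypothetical in-memory stand-in for the application layer: in a real nT, post_tweet, show_tweet and timeline_ids would be the same code paths the UI uses, including any redis or queue handling. Plain generated strings are used in place of faker so the example has no gem dependencies.

```ruby
# Hypothetical stand-in for the real application code paths.
class FakeApp
  def initialize
    @tweets = {}                               # id => [user_id, text]
    @follows = Hash.new { |h, k| h[k] = [] }   # fan id => star ids
  end

  def following?(fan, star)
    @follows[fan].include?(star)
  end

  def follow(fan, star)
    @follows[fan] << star
  end

  # Stores a tweet and returns its id.
  def post_tweet(user_id, text)
    id = @tweets.size + 1
    @tweets[id] = [user_id, text]
    id
  end

  def show_tweet(id)
    @tweets.fetch(id).last
  end

  def timeline_ids(fan)
    @tweets.select { |_id, row| @follows[fan].include?(row[0]) }.keys
  end
end

# Returns 200 when every tweet reads back intact and appears in the
# fan's timeline, otherwise 400.
def validate(app, n, star, fan)
  app.follow(fan, star) unless app.following?(fan, star)
  texts = Array.new(n) { |i| "validation tweet #{i} #{rand(1_000_000)}" }
  ids = texts.map { |text| app.post_tweet(star, text) }
  stored_ok = ids.zip(texts).all? { |id, text| app.show_tweet(id) == text }
  timeline_ok = (ids - app.timeline_ids(fan)).empty?
  stored_ok && timeline_ok ? 200 : 400
end

puts validate(FakeApp.new, 5, 1, 2) # 200
```

The two arrays texts and ids correspond to the "two arrays" suggestion in the steps above.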

GET /test/corrupted?user_count=u

  • A corruption checking algorithm is provided as a set of complementary methods
  • The url will call the corrupted? method on u randomly chosen users
  • Returns 200 or 400

Corruption Checks on User and Tweet

  • On the User model: user.corrupted?()
  • For this user,
    1. Check that each other user that it follows has it as a follower
    2. Check that each tweet that the user issued is not corrupted
  • On the Tweet model: tweet.corrupted?()
  • For this Tweet
    1. Check that it has valid values in all the required fields
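For orientation only, here is a guess at what such a pair of methods might look like on plain Ruby objects. This is not the provided algorithm: the class layout, the attribute names, and the choice of which fields count as required are all assumptions.

```ruby
class Tweet
  attr_reader :user_id, :text, :created_at

  def initialize(user_id, text, created_at)
    @user_id = user_id
    @text = text
    @created_at = created_at
  end

  # Corrupted when a required field (assumed set) is missing or empty.
  def corrupted?
    user_id.nil? || created_at.nil? || text.nil? || text.empty?
  end
end

class User
  attr_reader :id, :following, :followers, :tweets

  def initialize(id)
    @id = id
    @following = []
    @followers = []
    @tweets = []
  end

  # Corrupted when a followed user does not list this user as a
  # follower, or when any of this user's tweets is corrupted.
  def corrupted?
    missing_back_link = following.any? { |other| !other.followers.include?(self) }
    missing_back_link || tweets.any?(&:corrupted?)
  end
end

star = User.new(1)
fan = User.new(2)
fan.following << star
puts fan.corrupted? # true
star.followers << fan
puts fan.corrupted? # false
```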

GET /test/stress?n=n&star=u1&fan=u2

  • Checks for valid processing of the app by stressing it
  • The purpose is to detect whether the nT back end is faithfully storing and retrieving the data
  • This cannot be taken for granted, because concurrency bugs that lead to cache mistakes are very easy to introduce
  • star and fan are the user_ids of existing users
  • In one request, does the following process
    1. Make sure that user fan is following user star and if not, make it so
    2. Have user star create n tweets using faker.
    3. Don’t just use Tweet.create; use the code that is actually executed when a user submits a tweet
    4. This will invoke whatever optimizations, follower handling which may include redis, queues, or other optimizations.
    5. Store the tweet ids and the content for later.
    6. You can simply use two arrays, one of the faker text for each tweet and one of the resultant id of the tweet
  • Once complete
    1. query the backend for those same ids
    2. Again don’t just use Tweet.find; use the code that is actually executed when displaying a tweet in the ui
    3. Similarly, it will use whatever optimizations, redis, queues and so on.
    4. query the timeline of fan to make sure that the new tweets are listed
    5. Again, use the code that is used by the front end not just the low level code.
  • Note: this is not checking HTML; it bypasses that and uses the lower level internal methods (you can decide which).
  • Returns 200 or 400

Scalability Testing Protocol

Here’s how testing of scaling will be done with loader.io. You need to make sure that your version of nanoTwitter performs as well as possible in these scenarios!

Setup

Before we run each standardized test, we want to get each server to a known state. To do that, we run the following commands directly from a browser. We want to make sure that the complete seed data is loaded, and we will ask you to show us this before the test. The numbers are approximately:

  • x users
  • y tweets
  • z relations

Required urls (all have to be GET)

  • /?user_id=n
  • /search?phrase=word
  • /test/tweet?user_id=x&tweet_count=y
  • /test/reset?user_count=u

Test Scripts

  1. Rotate through 5 different user-ids, including testuser
  2. Rotate through 10 different urls, as follows:
    • 7 times /?user_id=n
    • 2 times /test/tweet?user_id=x&tweet_count=y
    • 1 time /search?phrase=z
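To make the 7/2/1 mix concrete, the sketch below builds one pass of ten URLs while cycling through the supplied user ids. The function name, the default tweet count and search phrase, and the order of the URLs within a pass are assumptions; the tweet URL uses the tweet_count parameter from the /test/tweet route definition.

```ruby
# Builds one pass of the rotation: 7 home pages, 2 tweet requests and
# 1 search, cycling through the given user ids.
def rotation(user_ids, tweet_count: 1, phrase: "word")
  kinds = [:home] * 7 + [:tweet] * 2 + [:search]
  kinds.each_with_index.map do |kind, i|
    uid = user_ids[i % user_ids.length]
    case kind
    when :home   then "/?user_id=#{uid}"
    when :tweet  then "/test/tweet?user_id=#{uid}&tweet_count=#{tweet_count}"
    when :search then "/search?phrase=#{phrase}"
    end
  end
end

puts rotation([1, 2, 3, 4, 5])
```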

Test runs - using maintain client load

  1. Run payload with u=250
  2. Run payload with u=500
  3. Run payload with u=1000
  4. Run payload with u=2000

Data Collected

  • Average response time
  • Worst response time
  • Number of timeouts
  • Number of successes
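The load-testing tool is expected to report these figures itself, so the sketch below is only meant to pin down what each number means. The sample format ([response_time_ms, status] pairs with status :ok or :timeout) and the choice to compute times over successful requests only are assumptions.

```ruby
# samples: array of [response_time_ms, status] pairs,
# where status is :ok or :timeout (assumed format).
def summarize(samples)
  ok_times = samples.select { |_, status| status == :ok }.map(&:first)
  {
    average_ms: ok_times.empty? ? nil : ok_times.sum.to_f / ok_times.length,
    worst_ms: ok_times.max,
    timeouts: samples.count { |_, status| status == :timeout },
    successes: ok_times.length
  }
end

p summarize([[100, :ok], [300, :ok], [nil, :timeout]])
```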