NanoTwitter Functionality slides

Details of the functionality (looking from the outside) that nanoTwitter requires

Minimum nT Functionality

General Notes

NanoTwitter (nT) is a baby version of Twitter designed as a platform for experimentation with scaling issues.
The list of features and URLs is quite incomplete. It is meant to set the pattern for you to build upon.
All the URLs, for both the User Interface and the Web Services API, will follow REST design principles as far as possible.

Users

Can register for an account by supplying at least an email and a password (plus confirmation password)
Can never be deleted
Are assigned a numeric id (the primary key) which will be used in certain APIs

Authentication

All pages include login/logout and register links
If a user is logged in then the login/logout link says logout; otherwise it says login
Authentication will use a simplistic hashed password in the User record
We will not be concerned for this exercise with having tight security
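
As a rough illustration of what a "simplistic hashed password" could look like, here is a minimal sketch using only the Ruby standard library. The module name `SimplePassword`, its methods, and the `salt$digest` storage format are assumptions made up for this example, not part of the nT spec. A production app would normally use a purpose-built password hashing library instead.

```ruby
require "digest"
require "securerandom"

# Hypothetical helper; the names and the "salt$digest" format are illustrative only.
module SimplePassword
  # Returns a "salt$digest" string that fits in a single column of the User record.
  def self.digest_for(password, salt = SecureRandom.hex(8))
    [salt, Digest::SHA256.hexdigest(salt + password)].join("$")
  end

  # True when the candidate password reproduces the stored value.
  def self.match?(stored, candidate)
    salt = stored.split("$", 2).first
    digest_for(candidate, salt) == stored
  end
end

stored = SimplePassword.digest_for("password")
puts SimplePassword.match?(stored, "password")  # checks the correct password
puts SimplePassword.match?(stored, "letmein")   # checks a wrong password
```

On login, the app would look up the User by email and call `match?` with the stored value and the submitted password.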

Non-logged in users:

Will not be able to do anything other than register an account or log into an existing account

Logged in users:

Can follow and un-follow other registered users
Can create Tweets
Can see the flow of the last 50 tweets by the users that they have followed
Can search for a phrase
Can see a list of users who are following them
Can see a list of the users that they are following
Can see a list of tweets referring to a particular “#tag”

Tweets:

Consist of
- up to 280 characters of text
- a date-time of creation
- zero or more @mentions of other users
- zero or more “#tags” of topics
Belong to one user
Can never be deleted
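
The tweet rules above can be sketched in plain Ruby as follows. This is only an illustration: the `NtTweet` struct, the `build_tweet` helper, and the exact regular expressions for @mentions and #tags are assumptions, and a real nT would put this logic in its Tweet model.

```ruby
# Hypothetical value object; the field names are assumptions, not part of the spec.
NtTweet = Struct.new(:user_id, :text, :created_at, :mentions, :tags)

MAX_TWEET_LENGTH = 280

# Builds a tweet for one user, extracting @mentions and #tags from the text.
# Raises ArgumentError when the text is longer than 280 characters.
def build_tweet(user_id, text, now = Time.now)
  if text.length > MAX_TWEET_LENGTH
    raise ArgumentError, "tweet is longer than #{MAX_TWEET_LENGTH} characters"
  end

  # The lookbehind skips things like email addresses ("a@b.com").
  mentions = text.scan(/(?<!\w)@(\w+)/).flatten.uniq
  tags     = text.scan(/(?<!\w)#(\w+)/).flatten.uniq
  NtTweet.new(user_id, text, now, mentions, tags)
end

tweet = build_tweet(1, "Thanks @bob for the #ruby tips")
p tweet.mentions
p tweet.tags
```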

Search

A page with at least a search box
User can enter a phrase or a “@username” or a “#tag” or some combination
Result of matching is displayed
Top 50 are displayed with next and previous links
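
One possible way to produce the "top 50 with next and previous links" behavior is sketched below. The `paginate` helper and its return shape are hypothetical. A real nT would more likely push the limit and offset into the database query rather than slice an in-memory array.

```ruby
PAGE_SIZE = 50

# Returns one page of results plus the page numbers for the previous and
# next links (nil when there is no such page). `page` is 1-based.
def paginate(results, page)
  page = [page.to_i, 1].max
  offset = (page - 1) * PAGE_SIZE
  {
    items: results[offset, PAGE_SIZE] || [],
    prev_page: page > 1 ? page - 1 : nil,
    next_page: offset + PAGE_SIZE < results.length ? page + 1 : nil
  }
end

page = paginate((1..120).to_a, 2)
puts page[:items].length
p page[:prev_page]
p page[:next_page]
```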

General User Interface

Basic bootstrap html/css/javascript interface at minimum
Clean but doesn’t have to be fancy

Routes

Home page: /
Specific user’s page: /user/1234
Search form: /search
Display Full Text Search results: /search?phrase=
Display the user registration page: /register
Display user login prompt, and check for correct password: /login
Logout: /logout

Test Interface

The Test Interface essentially is a set of URLs which invoke special functionality that allows your nanoTwitter to be tested from a browser, and is the foundation of the scalability testing we will be doing.

It works by implementing some special routes (that all start with /test/xxxx) that perform operations a normal user would never do. Because of this, if the app were really deployed, you would protect those routes behind the firewall to make sure that no one but the developers could access them.

These routes will be used during the load testing to reset the server, create lots of fake users, measure performance and so on. It’s analogous to the Onboard Diagnostics Interface that all cars have. There’s a plug near the steering wheel that you probably never noticed, where the mechanic (or you) can plug an instrument to inspect the internal condition of the car.

The Test user

There’s a user that we use as part of many of the tests. We refer to the user as “testuser”. When you create that user use the following attributes:

name: testuser
email: testuser@sample.com
password: “password”

Special url argument

To facilitate testing, any URL can have ?user_id=x appended, which will bypass the login and automatically make this nT session logged in as user x. For example:
GET /?user_id=x would show the homepage as if user_id x was logged in.
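
A minimal sketch of how the ?user_id=x bypass might be resolved on each request is shown below. The `effective_user_id` helper is hypothetical, and `params` and `session` are plain hashes standing in for whatever request and session objects your web framework provides.

```ruby
# Decides which user the current request acts as.
# A numeric ?user_id=x overrides the session and logs that user in;
# anything else leaves the existing session untouched.
def effective_user_id(params, session)
  override = params["user_id"].to_s
  session["user_id"] = override.to_i if override.match?(/\A\d+\z/)
  session["user_id"]
end

session = {}
p effective_user_id({ "user_id" => "42" }, session)  # logs in as user 42
p effective_user_id({}, session)                     # later request, same session
```

In a real app this would run in a before-filter, so that every route sees the same logged-in user.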

Test Data

Url to reset and load the standard seed is: /test/reset/standard?tweets=n&users=u&follows=u
Dataset is the standard Seed Data
If necessary, you may modify the creation dates of tweets, but all the other data should be intact
Our tests will require the complete set of records

Test Interface

`GET /test/reset?user_count=u`

Deletes all users, tweets and follows
Recreates TestUser
Imports data from the standard seed data (see: Seed Data)
- ?user_count=u means to import u users from the seed data…
- Including all the related follows. The user mentioned in the follow also gets imported, which means you end up with more than u users. But just go one deep.
Example: /test/reset?user_count=10 means to import 10 users from the seed data plus their tweets and follows.
Returns 200 or 400
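
The "one deep" rule can be read in more than one way. The sketch below assumes it means: take the first u users, import every follow in which one of them is the follower, and also import the users they follow, without chasing those users' own follows. The `users_to_import` helper and its input shapes are hypothetical.

```ruby
# seed_user_ids: user ids in seed-file order.
# seed_follows:  [follower_id, followed_id] pairs from the seed data.
# Returns the user ids and follow pairs to load for ?user_count=u.
def users_to_import(seed_user_ids, seed_follows, user_count)
  chosen  = seed_user_ids.first(user_count)
  follows = seed_follows.select { |follower, _followed| chosen.include?(follower) }
  # Users that are only mentioned in a follow get imported too (one level only).
  extra   = follows.map { |_follower, followed| followed }.uniq - chosen
  { users: chosen + extra, follows: follows }
end

seed_follows = [[1, 2], [1, 4], [2, 5], [4, 3], [3, 1]]
p users_to_import([1, 2, 3, 4, 5], seed_follows, 2)
```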

`GET /test/tweet?user_id=x&tweet_count=y`

x is the user id of some user
y is how many randomly generated tweets are submitted on that user’s behalf
Example: GET /test/tweet?user_id=123&tweet_count=22 will generate 22 random tweets for user 123
Returns 200 or 400

`GET /test/status`

One page “report”:
- How many users, follows, and tweets are there
- What is the TestUser’s id
Returns 200 or 400
Example: /test/status

`GET /test/validate?n=n&star=u1&fan=u2`

Checks for valid processing of the app
The purpose is to detect whether the nT back end is faithfully storing and retrieving the data
This cannot be taken for granted, because concurrency bugs that corrupt cached data are easy to introduce
star and fan are the user_ids of existing users
In one request, it does the following:
1. Make sure that user fan is following user star and if not, make it so
2. Have user star create n tweets using faker.
3. Don’t just use Tweet.create; use the code that is actually executed when a user submits a tweet
4. This will invoke whatever optimizations, follower handling which may include redis, queues, or other optimizations.
5. Store the tweet ids and the content for later.
6. You can simply use two arrays, one of the faker text for each tweet and one of the resultant id of the tweet
Once complete query the backend for those same ids
1. Again don’t just use Tweet.find; use the code that is actually executed when displaying a tweet in the ui
2. Similarly, it will use whatever optimizations, redis, queues and so on.
Query the timeline of fan to make sure that the new tweets are listed
1. Again, use the code that is used by the front end not just the low level code.
Note: this does not check the HTML; it bypasses that and uses the lower-level internal methods (you can decide).
Returns 200 or 400
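
The validation flow above might be sketched like this. `FakeNanoTwitter` is a hypothetical in-memory stand-in. In a real nT, its methods would be the same code paths the UI uses (including any redis or queue handling), and the tweet text would come from faker rather than the random strings used here to keep the example dependency-free.

```ruby
# Hypothetical in-memory stand-in for the application's real code paths.
class FakeNanoTwitter
  def initialize
    @tweets = {}                                        # tweet id => [author id, text]
    @follows = Hash.new { |hash, fan| hash[fan] = [] }  # fan id => ids of users followed
    @next_id = 0
  end

  def following?(fan, star)
    @follows[fan].include?(star)
  end

  def follow(fan, star)
    @follows[fan] << star
  end

  def post_tweet(author, text)
    @next_id += 1
    @tweets[@next_id] = [author, text]
    @next_id
  end

  def tweet_text(id)
    tweet = @tweets[id]
    tweet && tweet[1]
  end

  # Ids of the last 50 tweets by users that fan follows.
  def timeline_ids(fan)
    @tweets.select { |_id, tweet| @follows[fan].include?(tweet[0]) }.keys.last(50)
  end
end

# Returns 200 when every tweet round-trips and shows up in fan's timeline, else 400.
def validate(app, n, star, fan)
  app.follow(fan, star) unless app.following?(fan, star)
  texts = Array.new(n) { |i| "validation tweet #{i} #{rand(1_000_000)}" }
  ids = texts.map { |text| app.post_tweet(star, text) }

  stored_ok = ids.zip(texts).all? { |id, text| app.tweet_text(id) == text }
  timeline  = app.timeline_ids(fan)
  listed_ok = ids.last(50).all? { |id| timeline.include?(id) }
  stored_ok && listed_ok ? 200 : 400
end

puts validate(FakeNanoTwitter.new, 5, 1, 2)
```

The two arrays `texts` and `ids` are the ones mentioned in the steps: one holds the generated text and the other the resulting tweet ids.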

`GET /test/corrupted?user_count=u`

A corruption checking algorithm is provided as a set of complementary methods
The url will call the corrupted? method on u randomly chosen users
Returns 200 or 400

Corruption Checks on User and Tweet

On the User model: user.corrupted?()
For this user,
1. Check that each user that it follows has it as a follower
2. Check that each tweet that the user issued is not corrupted
On the Tweet model: tweet.corrupted?()
For this Tweet
1. Check that it has valid values in all the required fields
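
Here is a minimal sketch of the two complementary methods, written against plain Ruby objects instead of real models. The class names `CheckedUser` and `CheckedTweet` are hypothetical, and the choice of required fields (user id, text, creation time) is an assumption based on the tweet description earlier.

```ruby
# Hypothetical plain-Ruby models; a real nT would define corrupted? on its own models.
class CheckedTweet
  attr_reader :user_id, :text, :created_at

  def initialize(user_id:, text:, created_at:)
    @user_id = user_id
    @text = text
    @created_at = created_at
  end

  # Corrupted when a required field is missing or the text is empty or too long.
  def corrupted?
    user_id.nil? || created_at.nil? || text.nil? || text.empty? || text.length > 280
  end
end

class CheckedUser
  attr_reader :id, :following, :followers, :tweets

  def initialize(id)
    @id = id
    @following = []
    @followers = []
    @tweets = []
  end

  # Records the relationship on both sides.
  def follow(other)
    following << other
    other.followers << self
  end

  # Corrupted when a followed user does not list this user as a follower,
  # or when any of this user's tweets is corrupted.
  def corrupted?
    following.any? { |star| !star.followers.include?(self) } || tweets.any?(&:corrupted?)
  end
end

fan = CheckedUser.new(1)
star = CheckedUser.new(2)
fan.follow(star)
fan.tweets << CheckedTweet.new(user_id: 1, text: "hello", created_at: Time.now)
puts fan.corrupted?
```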

`GET /test/stress?n=n&star=u1&fan=u2`

Checks for valid processing of the app by stressing it
The purpose is to detect whether the nT back end is faithfully storing and retrieving the data
This cannot be taken for granted, because concurrency bugs that corrupt cached data are easy to introduce
star and fan are the user_ids of existing users
In one request, it does the following:
1. Make sure that user fan is following user star and if not, make it so
2. Have user star create n tweets using faker.
3. Don’t just use Tweet.create; use the code that is actually executed when a user submits a tweet
4. This will invoke whatever optimizations, follower handling which may include redis, queues, or other optimizations.
5. Store the tweet ids and the content for later.
6. You can simply use two arrays, one of the faker text for each tweet and one of the resultant id of the tweet
Once complete
1. query the backend for those same ids
2. Again don’t just use Tweet.find; use the code that is actually executed when displaying a tweet in the ui
3. Similarly, it will use whatever optimizations, redis, queues and so on.
4. query the timeline of fan to make sure that the new tweets are listed
5. Again, use the code that is used by the front end not just the low level code.
Note: this does not check the HTML; it bypasses that and uses the lower-level internal methods (you can decide).
Returns 200 or 400

Scalability Testing Protocol

Here’s how testing of scaling will be done with loader.io. You need to make sure that your version of nanoTwitter performs as well as possible in these scenarios!

Setup

Before we run each standardized test, we want to get each server to a known state, so we issue the following commands directly from a browser. We want to make sure that the complete seed data is loaded, and we will ask you to show us this before the test. The numbers are approximately:

x users
y tweets
z relations

Required urls (all have to be GET)

/?user_id=n
/search?phrase=word
/test/tweet?user_id=x&tweet_count=y
/test/reset?count

Test Scripts

During the test “someone” may try to log in by hand to make sure things are running ok
We will use the Payload File feature of loader.io
I will provide one.
- Here’s an example payload file
- And another example payload file

Rotate through 5 different user-ids, including testuser
Rotate through 10 different urls, as follows:
7 times /?user_id=n
2 times /test/tweet?user_id=x&tweet_count=y
1 time /search?phrase=z
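
To make the rotation concrete, the sketch below generates the list of request paths for a set of user ids. The `rotation` helper, the default phrase, and the default tweet count are hypothetical. It assumes the 7/2/1 mix is applied per user id and uses the parameter names from the `GET /test/tweet?user_id=x&tweet_count=y` route definition. It does not attempt to reproduce loader.io's payload file format.

```ruby
# For each user id: 7 home-page requests, 2 tweet-generation requests, 1 search.
def rotation(user_ids, tweet_count: 1, phrase: "word")
  user_ids.flat_map do |id|
    Array.new(7) { "/?user_id=#{id}" } +
      Array.new(2) { "/test/tweet?user_id=#{id}&tweet_count=#{tweet_count}" } +
      ["/search?phrase=#{phrase}"]
  end
end

paths = rotation([1, 2, 3, 4, 5])
puts paths.length
puts paths.first(10)
```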

Test runs - using maintain client load

Run payload with u=250
Run payload with u=500
Run payload with u=1000
Run payload with u=2000

Data Collected

Average response time
Worst response time
Number of timeouts
Number of successes