Logistics
Review: Achieving Scale
- Measuring performance:
- How many Xs per second?
- and/or how long does it take to Y?
- Analysis
- Instrumentation (basically logging)
- Deep thought
- Identify the bottleneck
- Action
- Remember: One of the cardinal “sins” is optimizing early
- Instead, optimize based on measurement
- Discover which of your product’s features are causing a scaling problem
- Consider which of your techniques might be brought to bear
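A minimal sketch of the "measure first" step, using Ruby's standard Benchmark module. The `handle_request` method and the iteration count are hypothetical stand-ins for whatever "X" and "Y" are in your product.

```ruby
require "benchmark"

# Hypothetical unit of work standing in for "one X" (e.g. one request).
def handle_request(n)
  (1..n).reduce(:+)
end

# Run the work `count` times and report throughput and average latency.
def measure(count)
  elapsed = Benchmark.realtime do
    count.times { handle_request(1_000) }
  end
  {
    per_second: count / elapsed,   # "How many Xs per second?"
    seconds_each: elapsed / count  # "How long does it take to Y?"
  }
end

stats = measure(2_000)
puts format("%.0f requests/s, %.6f s per request",
            stats[:per_second], stats[:seconds_each])
```

The absolute numbers depend on the machine. The point is to record them before and after a change, so that optimization follows measurement.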
Scalability Pattern: General Caching
What is caching?
- Save the result of a request with a given set of parameters.
- For a future request with the same parameters, (maybe) return the saved result
- System level caching. Storage:
- In ‘local’ memory
- In ‘remote’ memory
- In database
- In Cloud
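As an illustration of the definition above, here is a small sketch of a cache held in local memory and keyed by the request parameters. The class name `SimpleCache`, the time-to-live policy, and the `slow_lookup` lambda are assumptions made for the example. The "(maybe)" is modeled here as an expiry time.

```ruby
# A tiny in-process ("local memory") cache keyed by request parameters.
class SimpleCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {}
  end

  # Return the saved result for `params` if it is still fresh;
  # otherwise run the block, save its result, and return it.
  def fetch(params)
    entry = @store[params]
    return entry.value if entry && entry.expires_at > Time.now

    value = yield
    @store[params] = Entry.new(value, Time.now + @ttl)
    value
  end
end

calls = 0
cache = SimpleCache.new(60)
slow_lookup = lambda do |user_id, page|
  calls += 1 # count how often the real work runs
  "posts for #{user_id}, page #{page}"
end

first  = cache.fetch([42, 1]) { slow_lookup.call(42, 1) }
second = cache.fetch([42, 1]) { slow_lookup.call(42, 1) } # served from the cache
other  = cache.fetch([42, 2]) { slow_lookup.call(42, 2) } # different parameters
puts calls # prints 2
```

The same `fetch` shape carries over to remote memory or a database. Only the storage behind `@store` changes.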
Review of storage system architectural hierarchy
- Processor
- Cores
- Caches
- On board memory
- Offboard Memory
- Very different speeds depending on cost
- On a special, very fast connection (bus)
- External local storage
- USB connected SSD
- Ethernet connected storage
- On local LAN
- Ethernet connected storage
- On separate network (internet/cloud)
Cost of operations
- Awareness of order of magnitude speed of operations:
- Access registers inside CPU
- Access CPU caches
- Access standard RAM
- Access local disk
- Access files
- Access local database
- Access over network
- To a nearby server
- To a nearby database server
- Access over the internet
- To a remote server
- To a remote database server
- To a remote Web Service
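To get a feel for the gap between two rungs of this ladder, the sketch below times a lookup in RAM against reading a small local file. It is only a rough illustration. The file name prefix and the iteration count are arbitrary, and the operating system's page cache will make the file read look faster than a true disk access.

```ruby
require "benchmark"
require "tempfile"

ITERATIONS = 10_000
table = { "key" => "value" }

# A small temporary file standing in for "access files".
file = Tempfile.new("cost-demo")
file.write("value")
file.flush

memory_time = Benchmark.realtime { ITERATIONS.times { table["key"] } }
file_time   = Benchmark.realtime { ITERATIONS.times { File.read(file.path) } }

puts format("hash lookup: %.2f us each", memory_time / ITERATIONS * 1e6)
puts format("file read:   %.2f us each", file_time / ITERATIONS * 1e6)

file.close
file.unlink
```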
Memoization:
- Caching applied to an individual method
- A basic programming technique
- Simple
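A minimal sketch of memoization in Ruby, using Fibonacci numbers as a stand-in for any expensive method. The class name `Fib` is just for illustration.

```ruby
class Fib
  def initialize
    # Results already computed, keyed by the method argument.
    @memo = { 0 => 0, 1 => 1 }
  end

  # Compute fib(n) once; later calls with the same n reuse the saved value.
  def fib(n)
    @memo[n] ||= fib(n - 1) + fib(n - 2)
  end
end

puts Fib.new.fib(40) # prints 102334155
```

Without the `@memo` hash the same recursion would recompute the smaller values an exponential number of times.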
Name-value databases
- Very fast searches and lookups
- Distributed searches and distributed databases
- Robust across system and application failures
Database Caching
- To a certain extent, it’s what databases do
- Caching both at the server (Postgres itself)
- And at the client (the Postgres and ActiveRecord subsystems)
- Yet a lot more can be done
HTML page caching
- Done at the web server
- Don’t regenerate the page if it’s requested again
- As long as you know it hasn’t changed
- Page fragment caching, including “Russian doll caching”
- A key feature of good frameworks
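The idea behind nested ("Russian doll") fragment caching can be sketched in plain Ruby. This is not the Rails API. The structs, the `fragment` helper, and the `version` field (standing in for an `updated_at` timestamp) are all assumptions made for the example.

```ruby
Comment = Struct.new(:id, :text, :version)
Post    = Struct.new(:id, :title, :version, :comments)

FRAGMENTS = {}          # cache key -> rendered HTML
RENDERS = Hash.new(0)   # counts how often each template really runs

# Return the cached fragment for `key`, rendering it only on a miss.
def fragment(key)
  FRAGMENTS[key] ||= yield
end

def render_comment(comment)
  fragment("comment/#{comment.id}-#{comment.version}") do
    RENDERS[:comment] += 1
    "<li>#{comment.text}</li>"
  end
end

# The outer fragment nests the inner ones.
def render_post(post)
  fragment("post/#{post.id}-#{post.version}") do
    RENDERS[:post] += 1
    items = post.comments.map { |c| render_comment(c) }.join
    "<h1>#{post.title}</h1><ul>#{items}</ul>"
  end
end

post = Post.new(1, "Caching", 1,
                [Comment.new(1, "first", 1), Comment.new(2, "second", 1)])
render_post(post) # renders the post and both comments
render_post(post) # everything comes from the cache

# Editing one comment bumps its version and the version of its parent.
post.comments[1].text = "edited"
post.comments[1].version += 1
post.version += 1
html = render_post(post) # post and comment 2 re-render; comment 1 is reused
```

Because the version is part of the key, a changed record simply misses the cache. Nothing has to be explicitly invalidated.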
Caching with “Redis”
Advantages
- Blindingly fast
- Many data types: list, set, sorted set and hashes
- Atomic operations
- Has many uses: caching, message queue, publish subscribe, sharing application global state
An instance of “network caching”
- Evolved from the original memcached
- Typical structure is a key-value store
- A NoSQL database, but in memory!
- Ruby bindings: gem redis
Wait, where’s the data actually stored?
- A Redis host, accessible over TCP/IP: DNS name + port number
- You can run it:
$ redis-server
- Heroku can run it for you with Redis to go. Nano size is free!
- In all cases, if the host dies, the data is gone (not 100% true: Redis can optionally persist to disk)
It has some interesting characteristics
- ATOMIC operations, e.g. “INCR” operation
- keys that expire (TTL)
- Supports other values: lists, sets, hashes
- And many many more
Heroku
- Can provide a basic free instance of it
- Remember it has a URL and can be shared across applications
heroku redis:cli
Redis Concepts
- Play with Redis
- Keys
- Are text with colons by convention, e.g. global:usercount
- but can be anything. You decide your structure. Colons are recommended.
- Values
- Are strings
- Or compounds: lists, sets, sorted sets, hashes
- Note we play with commands (a kind of a REPL)
- But you will be doing API calls
SET key value # store a singular key and give it a value
GET key # retrieve its value
INCR key # add one to key value, should be an integer. Atomic!
DECR key # subtract one from key value
EXPIRE key seconds # key will cease to exist `seconds` later
RPUSH key value # Append value to list
LPUSH key value # Prepend value to list
LPOP key # Remove first value
RPOP key # Remove last value
SADD key value # Add value to set
SREM key value # Remove value from set
SISMEMBER key value # Is value a member of set
SUNION key1 key2 # Get union of two sets
ZADD key score value # Add value with score to sorted set
HSET hashname hashkey value # Add entry to hash
HGET hashname hashkey # Retrieve a value from hash
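In application code these commands become method calls. The redis gem exposes methods named after the commands (`set`, `get`, `incr`, `expire`, and so on). So that the sketch below runs without a server, it uses `MiniStore`, a hypothetical in-memory stand-in that mimics a few of those methods. With a running Redis host you would create the client with `Redis.new` instead.

```ruby
# Hypothetical in-memory stand-in for a few redis gem methods.
# Unlike real Redis, it is not shared between processes.
class MiniStore
  def initialize
    @data = {}
    @expires_at = {}
  end

  def set(key, value)
    @data[key] = value.to_s   # plain values are stored as strings
    @expires_at.delete(key)   # a fresh SET clears any earlier expiry
    "OK"
  end

  def get(key)
    purge(key)
    @data[key]
  end

  # Add one to the integer stored at key (a missing key counts as 0).
  def incr(key)
    purge(key)
    value = @data.fetch(key, "0").to_i + 1
    @data[key] = value.to_s
    value
  end

  def expire(key, seconds)
    return false unless @data.key?(key)

    @expires_at[key] = Time.now + seconds
    true
  end

  private

  # Drop the key if its expiry time has passed.
  def purge(key)
    deadline = @expires_at[key]
    return unless deadline && deadline <= Time.now

    @data.delete(key)
    @expires_at.delete(key)
  end
end

store = MiniStore.new
store.set("global:usercount", 10)
store.incr("global:usercount")
count = store.get("global:usercount")  # "11", returned as a string

store.set("session:abc", "some-user")
store.expire("session:abc", 0)         # expires immediately
expired = store.get("session:abc")     # nil once the key has expired
```

On a real server INCR is atomic because Redis executes each command as a single step, so concurrent clients never lose an increment. The stand-in makes no such guarantee.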
Putting it all together
- Redis is a power tool!
- Don’t be scared of putting lots of information into it
- Remember that it’s a cache. You need to have a real persistent store to recover from
- Example:
- Display list of 50 most recent posts for users who are followed by user uid
- Key is: 50_tweets_for_user:uid
- Value is: ordered list of tweet ids
- Processing:
- When list is displayed
- When user :u tweets
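One plausible way to fill in the two processing steps is to push each new tweet id onto every follower's list when the user tweets, trim the list to 50, and simply read the list when it is displayed. This is an assumption about the intended design, not something the slide spells out. `ListStore` is a hypothetical in-memory stand-in whose `lpush`, `ltrim` and `lrange` methods mirror the redis gem methods of the same names, and the follower data is made up.

```ruby
# Hypothetical in-memory stand-in for the Redis list commands used here.
class ListStore
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Prepend value to the list and return the new length.
  def lpush(key, value)
    @lists[key].unshift(value.to_s)
    @lists[key].length
  end

  # Keep only the elements from start to stop (inclusive).
  def ltrim(key, start, stop)
    @lists[key] = @lists[key][start..stop] || []
    "OK"
  end

  def lrange(key, start, stop)
    @lists[key][start..stop] || []
  end
end

TIMELINE_SIZE = 50

def timeline_key(uid)
  "50_tweets_for_user:#{uid}"
end

# When a user tweets: add the tweet id to each follower's cached list.
def fan_out(store, follower_ids, tweet_id)
  follower_ids.each do |uid|
    store.lpush(timeline_key(uid), tweet_id)
    store.ltrim(timeline_key(uid), 0, TIMELINE_SIZE - 1)
  end
end

# When the list is displayed: read the ids straight from the cache.
def timeline(store, uid)
  store.lrange(timeline_key(uid), 0, TIMELINE_SIZE - 1)
end

store = ListStore.new
followers = { "alice" => [1, 2] } # made-up data: users 1 and 2 follow alice
(1..60).each { |tweet_id| fan_out(store, followers["alice"], tweet_id) }
ids = timeline(store, 1) # the 50 newest ids, newest first
```

The write path does more work so that the read path is a single list read. If the cached list is lost, it has to be rebuilt from the persistent store.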
Thank you. Questions?