Queueing (Thu Apr 7, lect 22)

Queueing and message passing are basic techniques for scaling

Logistics

  • Secret code:

Scalability Pattern: Queueing

Queueing and SOA

  • SOA: Divide software into independent “services” that talk to each other
    • Client: code that needs the service
    • Server: code that implements the service
  • Previously: SOA required REST between servers
Discussion: Why is REST/HTTP between servers usually a synchronous request?

Queueing

  • Queueing is a fundamentally different way of creating services and concurrency
  • Synchronous:
    1. Client makes a request (HTTP GET) and waits
    2. Server receives request
    3. Server performs the service while the client waits, so both are tied up
    4. Server sends response
    5. Client is free to continue processing
  • Asynchronous:
    1. Client makes a request (‘Publish’ to a queue)
    2. Client can now continue
    3. Server picks next item off queue and does the service.
    4. When the service is complete, the server notifies the client so it can use the result.
  • That’s the theoretical framework
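The two flows above can be sketched in a few lines of Python. This is only an illustration using the standard library's in-process `queue.Queue`, not a real message broker; `do_service`, `call_synchronously`, and `run_async_demo` are hypothetical names made up for this example.

```python
import queue
import threading

def do_service(payload):
    """Stand-in for whatever work the server performs."""
    return payload.upper()

def call_synchronously(payload):
    # Synchronous flow: the caller is blocked until the result comes back.
    return do_service(payload)

def run_async_demo(payloads):
    requests = queue.Queue()   # the queue the client publishes to
    results = []

    def notify_client(result):
        # Async step 4: the server tells the client a result is ready.
        results.append(result)

    def server():
        while True:
            payload = requests.get()      # async step 3: pick next item off the queue
            if payload is None:           # sentinel: no more work
                requests.task_done()
                break
            notify_client(do_service(payload))
            requests.task_done()

    worker = threading.Thread(target=server)
    worker.start()

    for payload in payloads:
        requests.put(payload)             # async step 1: publish, returns at once
    # Async step 2: the client could do other work here.

    requests.put(None)
    requests.join()                       # wait only so the demo can return results
    worker.join()
    return results
```

With a single server thread the requests are handled in FIFO order, so `run_async_demo(["a", "b"])` returns the results in the order the requests were published.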

A small case study

  • Need: a service to send emails
  • Example: nanoTwitter, user wants to be notified by email when they are followed
Discussion: There are at least 4 patterns that you can use to implement this. What are the tradeoffs?

In the nanoTwitter logic:

  • Decide on the contents of the email, as a template (e.g. erb)
  • When a follow request is processed (during request handling)
    • Populate the template with info about who is following whom
    • Retrieve the email address
    • Initiate the sending of the email
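As a rough illustration of the "populate the template" step, here is a sketch in Python using `string.Template` in place of erb. The template wording, the function name `build_follow_email`, and the dictionary layout are all assumptions made up for the example, not nanoTwitter's actual code.

```python
from string import Template

# Hypothetical email template; erb would play this role in a Ruby app.
FOLLOW_EMAIL = Template(
    "Hi $followed_name,\n\n"
    "$follower_name is now following you on nanoTwitter.\n"
)

def build_follow_email(follower_name, followed_name, followed_email):
    """Fill in the template and bundle it with the recipient's address."""
    body = FOLLOW_EMAIL.substitute(
        follower_name=follower_name,
        followed_name=followed_name,
    )
    return {
        "to": followed_email,
        "subject": f"{follower_name} followed you",
        "body": body,
    }
```

The returned dictionary is what would then be handed to whichever sending mechanism is chosen.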

How to initiate sending of email?

  1. Inline:
    • Write a method that does the work: it packages up the email and sends it to the mail server. Execution continues only once the mail server accepts the email.
    • Problems: Request handling is delayed, so the whole site hangs for each follow request: UNACCEPTABLE
  2. Thread
    • Write a method that does the work (as above), but make the mail server request in a separate thread.
  3. Separate SOA / REST Service
    • Write a REST service (nanoTwitterEmailer) which accepts an EmailRequest and returns immediately, allowing the main-line code to continue. The email is sent to the outside mail server in a separate process by nanoTwitterEmailer.
    • Problems: How are errors handled? What if the REST service crashes or hangs? What if it is flooded with requests? How do we keep track of the emails that need to be sent? What if the whole nanoTwitterEmailer service dies?
  4. Message Queue
    • Use an EmailRequest message queue to collect the information about each email that has to be sent. Configure nanoTwitterEmailer to subscribe to the message queue and process the EmailRequests in the order they were submitted.
    • Result: The EmailRequest message queue robustly tracks the email requests. If the nanoTwitterEmailer dies, then when it comes back up the requests are still there. If there are too many requests, it’s simple to run a second nanoTwitterEmailer.
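The message-queue pattern, including the "run a second nanoTwitterEmailer" point, might look roughly like the sketch below. It uses an in-process `queue.Queue` and threads as stand-ins, so unlike a real message queue the requests would not survive a crash of the process; `handle_follow` and `run_emailers` are hypothetical names.

```python
import queue
import threading

email_requests = queue.Queue()   # stand-in for the EmailRequest message queue

def handle_follow(follower, followed_email):
    """Request handler: publish an EmailRequest and return right away."""
    email_requests.put({"to": followed_email, "follower": follower})
    return "follow recorded"

def run_emailers(worker_count):
    """Drain the queue with `worker_count` emailer threads; return what was sent."""
    sent = []
    sent_lock = threading.Lock()

    def emailer(name):
        while True:
            try:
                request = email_requests.get_nowait()
            except queue.Empty:
                return  # nothing left to send
            # A real emailer would talk to the mail server here.
            with sent_lock:
                sent.append((name, request["to"]))

    workers = [
        threading.Thread(target=emailer, args=(f"emailer-{i}",))
        for i in range(worker_count)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sent
```

Adding capacity is just a matter of raising `worker_count`: each request is taken off the queue by exactly one emailer, so no request is sent twice.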

RabbitMQ: One of several implementations

  • Runs in a process of its own
  • Defines the following terminology
    1. Producers: Software that sends messages
    2. Consumers: Software that consumes (and processes) messages
    3. Queues: FIFO “holding area” for messages
    4. Message: A bit of data which often incorporates an action or command
    5. Exchange: Receives messages from producers, and sticks them into a queue
    6. Acknowledgement: The consumer confirms that it has finished a message, so that if it dies before finishing, the message gets given to another consumer
    7. Durability: So that if the Queueing Service dies, the content of the queue survives
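To make the acknowledgement idea concrete, here is a toy in-memory model. It is deliberately NOT RabbitMQ's API: `ToyQueue` and its methods are invented for this sketch, it handles a single consumer, and it does not model exchanges or durability at all.

```python
from collections import deque

class ToyQueue:
    """In-memory FIFO with explicit acknowledgements (illustration only)."""

    def __init__(self):
        self._ready = deque()    # messages waiting to be delivered
        self._unacked = {}       # delivery tag -> message delivered but not yet acked
        self._next_tag = 1

    def publish(self, message):
        self._ready.append(message)

    def get(self):
        """Deliver the next message as (tag, message), or None if the queue is empty."""
        if not self._ready:
            return None
        message = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = message
        return tag, message

    def ack(self, tag):
        """The consumer finished the message, so it can be forgotten."""
        del self._unacked[tag]

    def consumer_died(self):
        """Put every delivered-but-unacked message back at the front, oldest first."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))
```

A message only leaves the system on `ack`; if the consumer disappears after `get` but before `ack`, `consumer_died` makes the same message available again for whoever asks next.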

Demonstration

Queues and Caches compared

  • Both are techniques to achieve more scale
  • More scale requires finding and “fixing” bottlenecks
  • “Bottleneck” = “A narrow place”
  • A slow operation at a busy place
  • How to speed up?

Caching

  • Is slow data retrieval the problem?
  • Is it ok for the data to be a little bit stale?
  • Does access follow repeating patterns?
  • Can the process continue even without the result?
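A small sketch of the trade-off behind these questions, using Python's `functools.lru_cache`. The data (`FAKE_FOLLOWERS`) and the function `follower_count` are made up for the example; the point is only that repeated lookups skip the slow retrieval, at the cost of possibly stale answers.

```python
from functools import lru_cache

# Hypothetical stand-in for a slow data store.
FAKE_FOLLOWERS = {1: ["bob", "carol"]}
lookup_count = 0   # how many times the "slow" retrieval actually ran

@lru_cache(maxsize=128)
def follower_count(user_id):
    """Pretend this hits the database; results are cached per user_id."""
    global lookup_count
    lookup_count += 1
    return len(FAKE_FOLLOWERS.get(user_id, ()))

first = follower_count(1)       # runs the slow retrieval
second = follower_count(1)      # served from the cache

FAKE_FOLLOWERS[1].append("dave")
stale = follower_count(1)       # still the old answer: the cache is stale

follower_count.cache_clear()    # explicit invalidation
fresh = follower_count(1)       # retrieval runs again and sees the new follower
```

Whether the stale answer between the update and the invalidation is acceptable is exactly the "a little bit stale" question above.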

Queueing

  • Is slowness of the actual procedure the problem?
  • Is the operation done for its “side effects”?
  • Can the process continue even without the operation completing?

References used in this talk