Monday, 12 March 2018

Amazon SWF

Overview
  • Workflow orchestration service
  • Use Cases
    • Coordination of tasks
    • Media processing
    • Web application back-ends
    • Business process workflows
    • Analytics pipelines
  • Alternatives
    • Step Functions

Model
  • Workflow - set of activities that carry out a "business process"
    • Max duration: 1 year
  • Workflow Execution - invocation of workflow (unique runId)
  • Workflow History 
    • Detailed record of every event (change) that occurred for the workflow
    • Passed to DECIDER to decide about next step (i.e. context)
    • Authoritative source of information about workflow execution
    • Allows application (i.e. DECIDER) to be stateless
    • Audit trail
    • Retained for configurable period of time
      • Max 90 days
  • Domain - logical container for the workflow
    • Workflows in different domains cannot interact
    • Retention period is configured here
  • Activity - name, version, timeout
    • Registered with SWF
  • Activity Task - one invocation of Activity
  • Long Polling
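    • Mechanism by which Activity Workers and Deciders receive their tasks
    • Connection is held open for up to 60 seconds
      • If no task arrives, an empty response is returned and the client must poll again
      • Similar to SQS long polling (ReceiveMessageWaitTimeSeconds)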
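The long-polling behaviour above can be sketched as a small loop. This is a minimal sketch, not a production worker: `poll_for_activity_task` is the boto3 SWF call, while `poll_for_work`, `FakeSwfClient`, the domain, the task list name and the worker identity are hypothetical names used so the example runs without AWS access.

```python
def poll_for_work(client, domain, task_list, max_polls=3):
    """Long-poll until a task arrives or max_polls empty responses are seen."""
    for _ in range(max_polls):
        # With the real service, each call can block for up to about 60 seconds.
        task = client.poll_for_activity_task(
            domain=domain,
            taskList={"name": task_list},
            identity="example-worker",  # hypothetical worker name
        )
        # An empty poll still returns a response, but with no usable taskToken.
        if task.get("taskToken"):
            return task
    return None


class FakeSwfClient:
    """Offline stand-in for boto3.client("swf") (an assumption for this sketch)."""

    def __init__(self, responses):
        self._responses = list(responses)
        self.calls = 0

    def poll_for_activity_task(self, **kwargs):
        self.calls += 1
        return self._responses.pop(0)


fake = FakeSwfClient([{"taskToken": ""}, {"taskToken": "abc", "input": "order-42"}])
task = poll_for_work(fake, "orders", "charge-card-tasks")
```

Here the first poll comes back empty, so the loop simply polls again and receives the task on the second call.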

Actors
  • Workflow Starter
    • Example: website where you place an order
  • Decider
    • Holds coordination logic
    • Receives Decision Task along with the context (state + workflow history)
      • Schedules Activity Task (decides about next task)
  • Activity Worker - program that receives Activity Task and performs the actual work
    • Polls for tasks it is capable of handling
    • Can be performed manually (e.g. by statistical analyst) using software to interact with SWF
    • Represents single "Thread" of execution
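Because the history is the only context, a Decider can be written as a pure function from history to decisions. The sketch below assumes a hypothetical three-step order workflow (`ORDER_STEPS` and the activity names are made up) and ignores failures and timeouts. The event and decision dictionaries follow the shapes SWF uses.

```python
ORDER_STEPS = ["VerifyOrder", "ChargeCreditCard", "ShipOrder"]  # hypothetical activities


def decide(history):
    """Return the next decisions using nothing but the workflow history."""
    completed = sum(1 for e in history if e["eventType"] == "ActivityTaskCompleted")
    if completed < len(ORDER_STEPS):
        name = ORDER_STEPS[completed]
        # Schedule the next activity in the fixed sequence.
        return [{
            "decisionType": "ScheduleActivityTask",
            "scheduleActivityTaskDecisionAttributes": {
                "activityType": {"name": name, "version": "1.0"},
                "activityId": f"{name}-{completed}",
            },
        }]
    # All steps are done, so close the workflow execution.
    return [{
        "decisionType": "CompleteWorkflowExecution",
        "completeWorkflowExecutionDecisionAttributes": {"result": "order fulfilled"},
    }]


started = [{"eventType": "WorkflowExecutionStarted"}]
first = decide(started)
```

Since `decide` keeps no state of its own, any Decider process can pick up the next Decision Task for the same execution.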

Data Exchange
  • Input provided by Workflow Starter
  • Activity Worker can return results to SWF
  • Decider can return completion result to SWF
  • Can pass user data between tasks (e.g. pointers) - recorded in the history
    • Acts as a scratchpad

Task
  • Assignment of work (e.g. charge credit card)
  • Stored by SWF and assigned to exactly one worker (decider or activity worker)
  • SWF tracks progress and maintains state
  • Types
    • Activity Task
      • processed by Activity Worker
      • Lambda Task - same as above but invokes Lambda function
    • Decision Task
      • determine the next step in the workflow execution
      • e.g. Workflow Started, Activity Task Completed

Task List
  • Queue of tasks that were scheduled
  • Separate for Decider and each Activity Type
  • Best effort ordering
  • Default task lists may be overridden with custom task lists (i.e. task routing) e.g. for 
    • Geographic distribution
    • Prioritization
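Task routing amounts to naming a task list when the Decider schedules the activity. A minimal sketch, assuming a hypothetical naming scheme of one task list per activity and region; `taskList` and `taskPriority` are fields of the ScheduleActivityTask decision, the helper name is made up.

```python
def schedule_on_task_list(activity_name, activity_id, region, priority=0):
    """Build a ScheduleActivityTask decision routed to a region-specific task list."""
    return {
        "decisionType": "ScheduleActivityTask",
        "scheduleActivityTaskDecisionAttributes": {
            "activityType": {"name": activity_name, "version": "1.0"},
            "activityId": activity_id,
            # Overrides the default task list registered with the activity type.
            "taskList": {"name": f"{activity_name}-{region}"},
            # Priority is passed as a string; higher numbers mean higher priority.
            "taskPriority": str(priority),
        },
    }


decision = schedule_on_task_list("ShipOrder", "ship-1", "eu-west-1", priority=5)
```

Only workers polling the `ShipOrder-eu-west-1` task list would receive this task.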

Workflow Execution Closure
  • Decider makes an explicit decision
    • CompleteWorkflowExecution
    • CancelWorkflowExecution
    • FailWorkflowExecution
    • ContinueAsNewWorkflowExecution
      • Useful for long-running workflows (or when history gets too big)
  • SWF Timeout
    • Workflow: START_TO_CLOSE
      • Workflow is closed
      • WorkflowExecutionTimedOut event in history
    • Decision Task: START_TO_CLOSE
      • Task marked as timed out
      • DecisionTaskTimedOut event in history
      • New decision task scheduled (retry but only once)
    • Activity Task
      • START_TO_CLOSE
        • Cannot close activity as {Completed, Canceled, Failed} anymore
      • HEARTBEAT
      • SCHEDULE_TO_START
        • How long it waits in the task list
      • SCHEDULE_TO_CLOSE
      • In each case
        • Event entry added
        • New Decision Task added
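Since every activity timeout produces a history event and a new Decision Task, the Decider is where the retry policy lives. The sketch below assumes a retry budget of three attempts (`MAX_ATTEMPTS` is an assumption, not an SWF default) and made-up timeout values. The four timeout fields and the `timeoutType` attribute are the ones SWF uses.

```python
MAX_ATTEMPTS = 3  # assumed retry budget, not an SWF default


def after_timeout(history, activity_name):
    """Decide what to do once an ActivityTaskTimedOut event is in the history."""
    timeouts = [
        e["activityTaskTimedOutEventAttributes"]["timeoutType"]
        for e in history
        if e["eventType"] == "ActivityTaskTimedOut"
    ]
    if len(timeouts) >= MAX_ATTEMPTS:
        # Budget exhausted: close the workflow as failed, naming the last timeout.
        return {
            "decisionType": "FailWorkflowExecution",
            "failWorkflowExecutionDecisionAttributes": {
                "reason": f"{activity_name} timed out ({timeouts[-1]})",
            },
        }
    # Otherwise reschedule the activity; timeouts are strings of seconds.
    return {
        "decisionType": "ScheduleActivityTask",
        "scheduleActivityTaskDecisionAttributes": {
            "activityType": {"name": activity_name, "version": "1.0"},
            "activityId": f"{activity_name}-attempt-{len(timeouts) + 1}",
            "scheduleToStartTimeout": "60",
            "startToCloseTimeout": "300",
            "scheduleToCloseTimeout": "360",
            "heartbeatTimeout": "30",
        },
    }
```

Each retry gets a fresh `activityId`, so the attempts can be told apart in the history.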

Lifecycle
  • Workflow Starter
    • SWF
      • Decider
    • SWF
      • Activity Task
    • SWF
      • Decider
    • SWF
      • ...


Advanced Features
  • Versioning
    • Allows different types of workers to run simultaneously
    • Free-form string
    • Associated with workflow and activity type
  • Signaling
    • Allows external process to inject data into open workflow
      • Entry added to history + decision task scheduled
      • Use cases
        • Resume paused workflow
        • Notify about external event (e.g. market closed)
  • Child Workflows
    • Workflow initiated by Decider
    • Parent waits for the child to complete
    • Completion recorded in parent history
    • Child Policy - what to do when parent terminates
      • TERMINATE
      • REQUEST_CANCEL
      • ABANDON (i.e. let it live)
    • Examples - reusable parts
      • Credit Card Authorization
      • Email
  • Markers
    • Record arbitrary information in the history
    • Decider responds with a RecordMarker decision
    • Examples
      • Counter of loops
      • Summarized information
  • Tags
    • Associate workflow execution with tags
    • Ability to query
    • Up to 5 tags of at most 256 characters each, assigned when the workflow execution is started
    • Examples
      • List workflows for specific fulfillment center
  • Timer
    • Ability to delay for a specified amount of time

Cancellation
  • Cooperative (no forcible interruption)
  • Cancel received by Decider
  • Inherent race condition between Decider and Activity Worker
  • Activity Worker uses task-heartbeat to receive cancellation request
    • Heartbeat optional but recommended for long-running tasks precisely so that they can be cancelled
    • Activity Worker may respond with RespondActivityTaskCanceled
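The cooperative cancellation described above can be sketched as a worker that heartbeats between units of work and checks the `cancelRequested` flag in the response. The three client methods are boto3 SWF calls; `run_activity`, `FakeSwfClient` and the work items are hypothetical, and the fake client exists only so the sketch runs offline.

```python
def run_activity(client, task_token, work_items):
    """Process items one by one, stopping early if cancellation was requested."""
    done = 0
    for _item in work_items:
        beat = client.record_activity_task_heartbeat(
            taskToken=task_token, details=f"{done} done"
        )
        if beat["cancelRequested"]:
            # The worker chooses to honour the request; nothing forces it to.
            client.respond_activity_task_canceled(
                taskToken=task_token, details=f"stopped after {done}"
            )
            return "canceled"
        done += 1  # placeholder for the real work on _item
    client.respond_activity_task_completed(taskToken=task_token, result=str(done))
    return "completed"


class FakeSwfClient:
    """Offline stand-in that requests cancellation on the Nth heartbeat."""

    def __init__(self, cancel_on_beat=None):
        self.cancel_on_beat = cancel_on_beat
        self.beats = 0
        self.responses = []

    def record_activity_task_heartbeat(self, **kwargs):
        self.beats += 1
        return {"cancelRequested": self.beats == self.cancel_on_beat}

    def respond_activity_task_canceled(self, **kwargs):
        self.responses.append(("canceled", kwargs["details"]))

    def respond_activity_task_completed(self, **kwargs):
        self.responses.append(("completed", kwargs["result"]))
```

A worker that never heartbeats never sees the flag, which is why heartbeats matter for long-running tasks.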

How to create workflow
  • Analyze application and identify discrete tasks (e.g. charge credit card)
  • Write Activity Workers 
  • Write Decider
  • Register Activities and Workflow with AWS
  • Start Activity Workers and Deciders
  • Initiate Workflow (Workflow Starter Application)
  • View Workflow Execution
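The registration and start steps above map onto a handful of API calls. A minimal sketch using boto3 method names and parameters; the domain, type names, task lists, timeouts and tag are made up, and `RecordingClient` is a hypothetical stand-in for `boto3.client("swf")` so the example runs without AWS. Against the real service, re-registering an existing domain or type raises an "already exists" error, which this sketch does not handle.

```python
def register_and_start(client, domain, order_id):
    """Register a domain, one activity and one workflow type, then start a run."""
    # The history retention period (in days, as a string) is set on the domain.
    client.register_domain(name=domain, workflowExecutionRetentionPeriodInDays="30")
    client.register_activity_type(
        domain=domain, name="ChargeCreditCard", version="1.0",
        defaultTaskList={"name": "charge-card-tasks"},
    )
    client.register_workflow_type(
        domain=domain, name="OrderWorkflow", version="1.0",
        defaultTaskList={"name": "order-decisions"},
    )
    # This is the Workflow Starter step; the input is passed on to the Decider.
    response = client.start_workflow_execution(
        domain=domain,
        workflowId=f"order-{order_id}",
        workflowType={"name": "OrderWorkflow", "version": "1.0"},
        taskList={"name": "order-decisions"},
        input=order_id,
        executionStartToCloseTimeout="3600",
        taskStartToCloseTimeout="30",
        childPolicy="TERMINATE",
        tagList=["fulfillment-center-7"],
    )
    return response["runId"]


class RecordingClient:
    """Records method names instead of calling AWS (an assumption for this sketch)."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def call(**kwargs):
            self.calls.append(name)
            return {"runId": "run-123"}
        return call


client = RecordingClient()
run_id = register_and_start(client, "orders", "42")
```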

Flow Framework
  • Programming model for SWF
  • Imposes certain architecture on the application
  • Platforms: Java, Ruby (deprecated)
