smart-answers: Flow
At the heart of each Smart Answer is a subclass of SmartAnswer::Flow. Subclasses like this make a DSL available for specifying metadata, and for defining all the question nodes, the outcome nodes, and the rules that control routing between the nodes.
Naming
The flow filename should be based on the path where the Smart Answer is to be found on gov.uk. For example, for a Smart Answer at https://www.gov.uk/example-smart-answer, the flow file should be named app/flows/example_smart_answer_flow.rb. Similarly, the template directory should be named app/flows/example_smart_answer_flow/.
The flow class should be a camel-case version of the flow filename, e.g. in the case above it would be ExampleSmartAnswerFlow. The class should inherit from SmartAnswer::Flow.
For example:
# app/flows/example_smart_answer_flow.rb
class ExampleSmartAnswerFlow < SmartAnswer::Flow
  def define
    # flow definition specified here
  end
end
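The naming convention can be sketched in plain Ruby. This is only an illustration: `flow_names_for` is a hypothetical helper and is not part of the smart-answers codebase.

```ruby
# Hypothetical helper (not part of smart-answers): derives the conventional
# flow file, template directory and class name from a gov.uk path slug.
def flow_names_for(slug)
  base = "#{slug.tr('-', '_')}_flow"
  {
    file: "app/flows/#{base}.rb",
    template_dir: "app/flows/#{base}/",
    class_name: base.split("_").map(&:capitalize).join,
  }
end

names = flow_names_for("example-smart-answer")
puts names[:file]         # app/flows/example_smart_answer_flow.rb
puts names[:template_dir] # app/flows/example_smart_answer_flow/
puts names[:class_name]   # ExampleSmartAnswerFlow
```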
Definition
The flow is defined within the define instance method of the Flow subclass using a DSL to specify metadata, question nodes and outcome nodes.
Unfortunately this DSL is overly complex and makes it all too easy to create code which is very hard to follow. The DSL makes heavy use of Ruby's BasicObject#instance_eval and #instance_exec. This means that:
- It's not obvious what variables are in scope within a given block of code.
- It's all too easy to write arbitrary custom code which can make each flow a bit of a special case. This doesn't help on the maintainability front.
- It's confusing that blocks of code are not necessarily executed in the order that they appear in the flow definition.
- The code in the blocks is not easily unit-testable.
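The scoping problem can be reproduced outside the app. The sketch below uses toy classes (`ToyFlow`, `ToyQuestion` and their methods are assumptions for illustration, not the real smart-answers classes) to show how `instance_eval` changes `self` inside a block, so that methods of the enclosing flow are no longer reachable.

```ruby
# Toy classes (assumptions for illustration only).
class ToyQuestion
  attr_reader :options

  def initialize
    @options = []
  end

  def option(key)
    @options << key
  end
end

class ToyFlow
  attr_reader :questions

  def initialize
    @questions = {}
  end

  def helper
    :defined_on_flow
  end

  def radio(key, &block)
    question = ToyQuestion.new
    # The block runs with self set to the question, not the flow.
    question.instance_eval(&block)
    @questions[key] = question
  end
end

# Returns false because `helper` is looked up on ToyQuestion inside the block.
def helper_visible_in_question_block?
  ToyFlow.new.radio(:other) { helper }
  true
rescue NameError
  false
end

flow = ToyFlow.new
flow.radio(:question_key) do
  option :yes # resolves against ToyQuestion
  option :no
end

p flow.questions[:question_key].options # => [:yes, :no]
p helper_visible_in_question_block?     # => false
```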
Each flow is instantiated in the FlowRegistry via the Flow.build method which, in turn, instantiates the flow and invokes its Flow#define method.
Currently, when the app is running in the Rails development and production environments (but not in test), a single instance of each flow class is instantiated at application start-up (or strictly speaking on the first request) and cached in the FlowRegistry, i.e. the same instance of a particular flow class is used to service all the user requests involving that Smart Answer. Thus it's important that no request-specific state is stored on the flow instance, otherwise this could leak into other user requests.
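To make the risk concrete, here is a minimal sketch of a caching registry. `ToyRegistry` and `LeakyFlow` are assumptions for illustration and do not mirror the real FlowRegistry API; the point is only that a memoised instance carries anything stored on it into the next request.

```ruby
# Toy registry (an assumption for illustration; not the real FlowRegistry).
class ToyRegistry
  def initialize
    @flows = {}
  end

  # Builds the flow on first lookup and reuses the same instance afterwards.
  def find(name)
    @flows[name] ||= yield
  end
end

class LeakyFlow
  attr_accessor :last_response # request-specific state: do not do this
end

registry = ToyRegistry.new

# First "request" stores something on the flow instance.
first = registry.find("example") { LeakyFlow.new }
first.last_response = "from request 1"

# Second "request" receives the very same instance, including its state.
second = registry.find("example") { LeakyFlow.new }
p second.equal?(first)   # => true
p second.last_response   # => "from request 1"
```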
It's important to understand this distinction between "flow definition time" and "request time" when trying to understand a flow definition. A number of the sections below refer to this distinction.
Metadata
There are a number of methods on SmartAnswer::Flow which allow "metadata" for the flow to be specified:
class ExampleSmartAnswerFlow < SmartAnswer::Flow
  def define
    name 'example-smart-answer' # this is the path where the Smart Answer will be registered on gov.uk (via the publishing-api)
    content_id "bfda3b4f-166b-48e7-9aaf-21bfbd606207" # a UUID used by v2 of the Publishing API (?)
    status :published # this indicates whether or not the flow is available to be published to live gov.uk; those with `:draft` status will be available on draft gov.uk

    # question & outcome definitions specified here
  end
end
Arbitrary Ruby code
Since the flow definition is just the body of a standard Ruby method, it's possible to write arbitrary Ruby code at any point within it. Arbitrary Ruby code is pretty much anything other than calls to the DSL methods described in this document. A local variable assigned at the top level of the #define method and used later within a block is one example of arbitrary Ruby code; it is explained in more detail below.
It should be possible to implement the majority of flows without writing such arbitrary Ruby code. Any decision to introduce such arbitrary code should be very carefully considered, particularly regarding its maintainability.
Some exceptions to the "no arbitrary Ruby" rule are: instantiating and storing a "calculator" object (in an on_response block), storing a response on the "calculator" object, and conditional logic within a next_node block. Even the latter should be kept to a minimum by extracting predicate methods onto the "calculator" object.
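As a sketch of what extracting a predicate method might look like, the calculator below is a plain Ruby object that can be unit-tested without the DSL. `ExampleCalculator`, `age`, `adult?` and the outcome keys in the comment are all hypothetical names chosen for illustration.

```ruby
# Hypothetical calculator (all names are assumptions for illustration).
class ExampleCalculator
  ADULT_AGE = 18

  attr_accessor :age

  # Predicate method: keeps the conditional logic out of the flow definition.
  def adult?
    age >= ADULT_AGE
  end
end

calculator = ExampleCalculator.new
calculator.age = 21
p calculator.adult? # => true

# Inside a flow definition, the routing block could then stay small, e.g.:
#
#   next_node do
#     calculator.adult? ? outcome(:adult_outcome) : outcome(:child_outcome)
#   end
```

Because the condition lives on the calculator, it can be covered by an ordinary unit test rather than by exercising the whole flow.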
Since the flow definition is just the body of a Flow instance method (#define), the value of self at the "top-level" within the method is an instance of the flow.
Local variables
Some flows (e.g. register-a-birth) define local variables towards the top of the flow definition. Standard Ruby scoping means that these are then available within question & outcome blocks, as well as within blocks nested inside these blocks. However, note that the value of the local variable is set at flow definition time and not at request time. Since a single flow instance is cached in the FlowRegistry at application start-up time, this means the local variable is only assigned once.
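The effect can be sketched with a toy flow whose definition captures a local variable in a block. `ToyClock` and `LocalVariableFlow` are assumptions for illustration; the counter stands in for anything that changes over time, such as the current date.

```ruby
# Toy counter standing in for a value that changes over time.
class ToyClock
  @ticks = 0

  def self.tick
    @ticks += 1
  end
end

# Toy flow (an assumption for illustration): `define` runs once, when the
# flow is built, and stores a block that is called for every request.
class LocalVariableFlow
  def initialize
    define
  end

  def define
    captured_tick = ToyClock.tick # assigned once, at flow definition time
    @request_block = -> { [captured_tick, ToyClock.tick] } # runs per request
  end

  def handle_request
    @request_block.call
  end
end

flow = LocalVariableFlow.new
p flow.handle_request # => [1, 2]
p flow.handle_request # => [1, 3]
```

The first element never changes because the local variable was assigned when the flow was defined, while the second element is recomputed on every simulated request.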
Start node
Every flow has an implicit start node which represents the "landing page" i.e. the page which displays the "Start" button. There is no representation of this start node in the flow definition. Clicking the "Start" button on the "landing page" takes you to the page for the first question node (see below).
Also see the documentation for landing page templates.
Question nodes
There is an implicit assumption that the question definition which appears first in the flow definition is the first question of the flow. All other "routing" between nodes is done explicitly in a single next_node block per question node.
By convention, question nodes are usually listed roughly in the order that a user would visit them, although this isn't always straightforward when there are multiple paths through the flow.
Since all "routing" is done explicitly within next_node blocks, the order of the question definitions (other than the first one) is functionally unimportant.
By convention, all question nodes are defined before any of the outcome nodes. However, again the order is functionally unimportant.
Question nodes are defined by calls to one of the various question-type methods. Since the value of self at the "top-level" within the #define method is an instance of the flow, these question-type methods are defined on the SmartAnswer::Flow base class.
For example:
class ExampleSmartAnswerFlow < SmartAnswer::Flow
  def define
    # self = instance of SmartAnswer::Flow

    # metadata specified here

    radio :question_key do
      option :option_key_1
      option :option_key_2

      # optional blocks specified here

      next_node do
        # routing logic specified here
      end
    end
  end
end
See In-question blocks below for more information on the optional blocks mentioned in a comment in the example.
Scope
The value of self inside the question node definition block (but outside other blocks) is the instance of the relevant question node. Thus in the example above, the calls to #option and #next_node are made on the instance of the question node.
In the same way that there is a single instance of each flow class, there is only ever a single instance of each question definition in the system at any one time i.e. a flow instance has a fixed set of question node definition instances.
Since the same instance of a question definition is used across multiple requests, it's important that no request-specific state is stored on them.
The value of self inside the next_node blocks (or the other in-question blocks) is an instance of SmartAnswer::State (see below), i.e. not an instance of a question node.
def define
  # self = instance of SmartAnswer::Flow

  value_question :question_key do
    # self = instance of SmartAnswer::Question::Value

    on_response do |response|
      # self = instance of SmartAnswer::State
    end

    next_node do
      # self = instance of SmartAnswer::State
    end
  end
end
Execution
The code inside the question node definition block (but outside other blocks) is executed at flow definition time, not at request time. However, the code inside the next_node blocks (or the other optional blocks) is executed at request time, not at flow definition time.
def define
  # executed at flow definition time

  value_question :question_key do
    # executed at flow definition time

    on_response do |response|
      # executed at request time
    end

    next_node do
      # executed at request time
    end
  end
end
State
The state object is intended to store all request-specific state, keeping that away from the instance of the flow which is reused across multiple requests. SmartAnswer::State inherits from OpenStruct and uses BasicObject#method_missing to allow arbitrary "state variables" to be written and read.
state = SmartAnswer::State.new(:first_node)
state.example_state_variable = 123
state.example_state_variable # => 123
Since the request path only includes the user's responses and not the question keys, and the app is stateless, every request has to be processed by walking through the question definition nodes starting at the first one. As a request is processed, the state is duplicated using Object#dup in each "transition" to a new node.
state = SmartAnswer::State.new(:first_node)
state.example_state_variable = 123
new_state = state.transition_to(:second_node, 'first-response')
new_state.equal?(state) # => false (i.e. they are *different* instances)
new_state.example_state_variable # => 123
It's important to note that Object#dup does not do a "deep" copy. Thus any "state variables" set on the state which are references to other objects will continue to reference the same instances of those other objects; those objects will not themselves be duplicated. The exceptions to this are the built-in state variables accepted_responses and forwarding_responses (see below), which are duplicated using Object#dup in State#initialize_copy.
state = SmartAnswer::State.new(:first_node)
state.example_state_variable = [1, 2, 3]
new_state = state.transition_to(:second_node, 'first-response')
new_state.example_state_variable # => [1, 2, 3]
new_state.example_state_variable.equal?(state.example_state_variable)
# => true (i.e. they are the *same* instance)
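The mechanism behind those exceptions can be sketched with a plain Ruby class. `ToyState` is an assumption for illustration and is much simpler than the real SmartAnswer::State; it only shows how an `initialize_copy` hook can duplicate selected attributes while everything else remains shared.

```ruby
# Toy state (an assumption for illustration; the real State differs).
class ToyState
  attr_accessor :accepted_responses, :calculator

  def initialize
    @accepted_responses = {}
  end

  # Ruby calls this hook on the new object when Object#dup is used.
  def initialize_copy(original)
    super
    # Duplicate the built-in hash so the copy can change it independently.
    @accepted_responses = original.accepted_responses.dup
  end
end

state = ToyState.new
state.calculator = Object.new

copy = state.dup
copy.accepted_responses[:first_node] = "first-response"

p state.accepted_responses                 # => {}
p copy.calculator.equal?(state.calculator) # => true (still shared)
```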
Since a new instance of the state is created for each request, it's not obvious why the state is duplicated in this way. I know that in the past there have been problems with state leaking between requests, so perhaps this was a mistaken attempt at preventing such leakage.
It's possible to view the state when you're running the app in the development environment.
Built-in state variables
- current_node_name: symbol key for the node being processed
- accepted_responses: hash of symbol keys for nodes previously processed with the answer input
- forwarding_responses: hash of unprocessed user input of questions, used to store answers when returning to previous nodes
- current_response: the user input provided for the current question if provided
- error: key for validation error message to display; usually a string (?)
state = SmartAnswer::State.new(:first_node)
# => #<SmartAnswer::State current_node_name=:first_node, accepted_responses={}, forwarding_responses={}, current_response=nil, error=nil>
first_state = state.transition_to(:second_node, 'first-response')
# => #<SmartAnswer::State current_node_name=:second_node, accepted_responses={:first_node=>"first-response"}, forwarding_responses={}, current_response=nil, error=nil>
second_state = first_state.transition_to(:third_node, 'second-response')
# => #<SmartAnswer::State current_node_name=:third_node, accepted_responses={:first_node=>"first-response", :second_node=>"second-response"}, forwarding_responses={}, current_response=nil, error=nil>
Note that some of the application code (e.g. illegal radio response) erroneously sets the error key to the validation error message string. Since this string is not the key to an error message, the default error message is displayed.
In-question blocks
All question definition blocks must include a single next_node block. A number of other in-question blocks can optionally be defined by passing a block to any of the following methods on SmartAnswer::Node and SmartAnswer::Question::Base: on_response, validate and next_node.
The value of self inside all these blocks is an instance of SmartAnswer::State (see above). The code inside these blocks is executed at request time, not at flow definition time.
All blocks of a particular type (within a single question) are executed at particular points in the request processing sequence, e.g. all on_response blocks are executed before all validate blocks.
The order in which the blocks are defined only affects the order in which they are executed within the group of blocks of the same type, e.g. when two on_response blocks are defined, the one defined first will be executed before the one defined second; however, even if a validate block is defined before both of these on_response blocks, it will always be executed after both of them.
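One way to picture this grouping is a question that files each block under its type when it is defined and later runs the groups in a fixed sequence. `ToyOrderedQuestion` is an assumption for illustration and is not how SmartAnswer::Question::Base is actually implemented.

```ruby
# Toy question (an assumption for illustration only).
class ToyOrderedQuestion
  BLOCK_ORDER = %i[on_response validate next_node].freeze

  def initialize(&definition)
    @blocks = Hash.new { |hash, type| hash[type] = [] }
    instance_eval(&definition) # definition time: blocks are only stored
  end

  # Define on_response, validate and next_node; each one stores its block.
  BLOCK_ORDER.each do |type|
    define_method(type) { |&block| @blocks[type] << block }
  end

  # Request time: run the blocks grouped by type, keeping definition order
  # within each group, and collect their return values.
  def process
    BLOCK_ORDER.flat_map { |type| @blocks[type].map(&:call) }
  end
end

question = ToyOrderedQuestion.new do
  validate    { "validate" }
  on_response { "on_response 1" }
  next_node   { "next_node" }
  on_response { "on_response 2" }
end

p question.process
# => ["on_response 1", "on_response 2", "validate", "next_node"]
```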
The block types are executed in the following order: on_response, validate, next_node.
Each of these block types and the point at which they are executed is explained in more detail below:
on_response(&block)
- These blocks are intended to be used to store user responses on a calculator state variable. They are a relatively new addition to the DSL.
- These blocks are called after the question/outcome template has been rendered and after the user response has been parsed from the request path, but before any of the validate blocks are executed.
- The parsed response is passed to the block as the only argument and by convention is named response.
- The block return value is not used and no state variable is stored.
The use of these blocks is encouraged; however, they should only ever be used to store a single calculator state variable (in the first question definition); otherwise they should only be used to store user responses on that calculator object.
validate(message_key, &block)
- These blocks are intended to be used to validate the user response.
- These blocks are executed after all the on_response blocks have been executed and before the next_node block is executed.
- The parsed response is passed to the block as the only argument and by convention is named response.
- If the block return value is truth-y, then no action is taken.
- If the block return value is false-y, then:
  - A SmartAnswer::InvalidResponse exception is raised with the message_key set as the exception message.
  - This exception is handled within the app and prevents the transition to the next node.
  - The message_key from the exception message is set on the built-in state variable, error.
  - When the question template is re-rendered, the error state variable is used to look up the appropriate validation error message in the question template.
The use of these blocks is encouraged. However, they should call valid_xxx? methods on the calculator state variable and not rely on the response argument passed into the block.
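# Good: the validation logic lives on the calculator
validate :error_outside_range do
  calculator.valid_weekly_amount_in_range?
end

# Bad: relies on the response argument passed into the block
validate do |response|
  calculator.valid_age?(response)
end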
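The failure path described above can be sketched as follows. Everything here is a toy stand-in: `ToyInvalidResponse`, `ToyRequestState`, `run_validation` and `process_response` are assumptions for illustration, and the real app's exception handling is more involved.

```ruby
# Toy stand-ins (assumptions for illustration; the real classes differ).
class ToyInvalidResponse < StandardError; end

ToyRequestState = Struct.new(:current_node_name, :error)

# Runs a validation block; a false-y result raises with the message key.
def run_validation(message_key, response)
  raise ToyInvalidResponse, message_key.to_s unless yield(response)
end

# Handles the exception so that the state stays on the current node and
# carries the message key in its error attribute.
def process_response(state, response)
  run_validation(:error_outside_range, response) { |value| (1..100).cover?(value) }
  ToyRequestState.new(:next_question, nil)
rescue ToyInvalidResponse => e
  ToyRequestState.new(state.current_node_name, e.message)
end

state = ToyRequestState.new(:weekly_amount, nil)
p process_response(state, 50).current_node_name # => :next_question
p process_response(state, 500).error            # => "error_outside_range"
```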
next_node(&block)
- There must only be one of these blocks per question definition.
- This block is intended to determine which node comes next based on the user responses so far.
- This block is executed after all the validate blocks have been executed.
- The built-in state variables, path, current_node & responses, are updated if this block returns successfully.
- The block return value must be the result of calling the #question or #outcome methods passing in the key of the next node - see the next node documentation for more details.
The use of this block type is required. However, it should call methods on the calculator state variable and not rely on the response argument passed into the block.
Further information
See the documentation on storing data.
Templates
See the documentation for question templates.
Outcome nodes
These are very similar to question nodes. There should never be a response associated with an outcome node. Having said that, the following methods are all technically available within the node definition, because they are instance methods on SmartAnswer::Outcome (or its superclasses):
If any attempt is made to process a response when the current node is an outcome node (e.g. by hacking the URL path), an exception will be raised.
Templates
See the documentation for outcome templates.