Mirror of https://github.com/IRS-Public/direct-file.git, synced 2025-06-28 04:25:52 +00:00.

Initial commit e0d5c84451 (parent 2f3ebd6693): 3,413 changed files with 794,524 additions and 1 deletion.

docs/adr/2023-08-21_encrypting-taxpayer-artifacts.md (new file, 35 lines)

# Encrypting taxpayer artifacts

Date: 08/21/2023

## Status

Approved

## Context

Our system stores taxpayer artifacts in object storage (AWS S3). These artifacts include files like PDFs of completed tax returns and XML bundles for submission to the Modernized e-File (MeF) system that contain sensitive taxpayer data in the form of personally identifiable information (PII) and federal tax information (FTI).

AWS S3 provides server-side encryption to protect the contents of objects on disk. However, anyone with access to the S3 bucket is able to view the plaintext contents of files. To mitigate the impact of a breach or leak (e.g., a misconfiguration resulting in public access to the S3 bucket, or improper file handling by administrators), we want an approach that provides an additional layer of encryption to sensitive files prior to their storage.

## Decision

We will use the [Amazon S3 Encryption Client](https://docs.aws.amazon.com/amazon-s3-encryption-client/latest/developerguide/what-is-s3-encryption-client.html) to encrypt taxpayer artifacts containing sensitive information before storing them in object storage. This aligns with [our decision to use S3 for all artifact storage](adr_taxpayer_artifact_storage.md).

We will configure the client to use a symmetric AES-GCM 256-bit wrapping key that is created and managed by AWS Key Management Service (KMS). The client library will use this wrapping key as part of an [envelope encryption scheme](https://docs.aws.amazon.com/amazon-s3-encryption-client/latest/developerguide/concepts.html#envelope-encryption) for encrypting each artifact with a unique data key. The wrapping key will be used only for artifact storage, allowing these encryption and decryption operations to be auditable independently from other KMS operations throughout the system.
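The S3 Encryption Client is a Java library and handles the envelope scheme internally; the sketch below is only a conceptual illustration of that scheme using the AWS SDK for JavaScript, with an assumed wrapping-key ARN and bucket name. It requests a per-object data key from KMS, encrypts the payload locally with AES-256-GCM, and stores the ciphertext together with the wrapped data key.

```typescript
import { KMSClient, GenerateDataKeyCommand } from "@aws-sdk/client-kms";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { createCipheriv, randomBytes } from "crypto";

// Assumed values for illustration; real configuration supplies these.
const WRAPPING_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/example-key-id";
const BUCKET = "artifact-storage";

const kms = new KMSClient({});
const s3 = new S3Client({});

// Envelope encryption: each artifact gets its own data key, which is stored
// only in wrapped (KMS-encrypted) form alongside the ciphertext.
export async function putEncryptedArtifact(key: string, plaintext: Buffer): Promise<void> {
  const dataKey = await kms.send(
    new GenerateDataKeyCommand({ KeyId: WRAPPING_KEY_ARN, KeySpec: "AES_256" })
  );

  const iv = randomBytes(12); // 96-bit nonce, as recommended for GCM
  const cipher = createCipheriv("aes-256-gcm", Buffer.from(dataKey.Plaintext!), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  const authTag = cipher.getAuthTag();

  await s3.send(
    new PutObjectCommand({
      Bucket: BUCKET,
      Key: key,
      Body: Buffer.concat([iv, authTag, ciphertext]),
      // The wrapped data key travels with the object; only KMS can unwrap it.
      Metadata: { "wrapped-data-key": Buffer.from(dataKey.CiphertextBlob!).toString("base64") },
    })
  );
}
```

In practice we rely on the encryption client to perform these steps (and the corresponding unwrap-and-decrypt on read) rather than implementing them by hand.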
## Consequences

- We use an open-source, actively maintained encryption library that limits the need for custom implementation of cryptographic details.
- The envelope encryption approach used by the library is conceptually similar to our chosen approach for column-level encryption.
- If needed, we can use this same library for [encrypting the content of messages sent to our queues](https://aws.amazon.com/blogs/developer/encrypting-message-payloads-using-the-amazon-sqs-extended-client-and-the-amazon-s3-encryption-client/).
- We already utilize many AWS services and libraries, so this decision does not expand the list of third-party providers we rely on. On the other hand, it further couples our implementation to AWS; if we move to a new cloud provider we will need to identify a suitable replacement library.
- We can further minimize overhead by enabling automatic key rotation on the wrapping key in AWS KMS.
- Developers must use this library exclusively when storing and retrieving taxpayer artifacts containing sensitive information.
- Plaintext taxpayer artifacts will not be readily available for debugging and troubleshooting. We will need to plan additional features for this functionality and ensure they are implemented in ways that support operational needs without undermining our security and privacy posture.
- [Additional configuration](https://docs.aws.amazon.com/amazon-s3-encryption-client/latest/developerguide/features.html#multipart-upload) may be necessary if we store files larger than 100 MB.

## Resources

- [ADR for storing artifacts in S3](docs/adr/adr_taxpayer_artifact_storage.md)
- [Amazon S3 Encryption Client](https://docs.aws.amazon.com/amazon-s3-encryption-client/latest/developerguide/what-is-s3-encryption-client.html)

docs/adr/2023-10-23_kms-key-management.md (new file, 60 lines)

# AWS KMS Key Management

Date: 10/23/2023

## Status

Draft

## Context

[ADR: Encrypting Taxpayer Data](adr_encrypting-taxpayer-data.md) specifies our strategy for encrypting taxpayer data.

Our system encrypts sensitive taxpayer data using AWS KMS keys. Since these keys, and access to them, make it possible to encrypt and decrypt the system's production data, we want to specify the standard operating procedure for brokering, configuring, managing access to, and rotating these keys.

## Decision

### Key Creation and Configuration

The IRS team overseeing our application infrastructure will be responsible for setting up the KMS keys for each environment/region and managing access to those keys.

We use KMS customer managed keys with a key usage of `ENCRYPT_DECRYPT` and the `SYMMETRIC_DEFAULT` encryption algorithm (a 256-bit AES-GCM algorithm). Additionally, the keys are set up as multi-region keys to support the IRS environments' ACTIVE-ACTIVE application infrastructure.

Our applications using the KMS encryption keys are configured with the full key ARN, in keeping with AWS's recommendations for using [strict mode](https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/best-practices.html#strict-discovery-mode).
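For illustration, a key with this configuration could be created with the AWS SDK (shown here with the JavaScript SDK; the actual keys are provisioned by IRS infrastructure administrators, and the description string is an assumption):

```typescript
import { KMSClient, CreateKeyCommand } from "@aws-sdk/client-kms";

const kms = new KMSClient({ region: "us-east-1" });

// Symmetric 256-bit AES-GCM key, restricted to Encrypt/Decrypt usage, and
// multi-region to support the ACTIVE-ACTIVE application infrastructure.
export async function createEncryptionKey(): Promise<string | undefined> {
  const { KeyMetadata } = await kms.send(
    new CreateKeyCommand({
      KeySpec: "SYMMETRIC_DEFAULT",
      KeyUsage: "ENCRYPT_DECRYPT",
      MultiRegion: true,
      Description: "Direct File data encryption key (illustrative)",
    })
  );
  // Applications are then configured with the full key ARN (strict mode),
  // rather than discovering keys at runtime.
  return KeyMetadata?.Arn;
}
```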
### Access Management

* Application users of the KMS keys are granted the least privilege possible, governed by a combination of key policies and IAM policies managed by IRS administrators.
* Direct File has access to the encryption keys for the following actions:
  * kms:GenerateDataKey
  * kms:Encrypt
  * kms:Decrypt
* The IRS's AWS cloud administrators have full key access rights.

### Rotation

We enable automatic key rotation in KMS on the default schedule for customer managed keys.

KMS wrapping key rotation will happen automatically, without intervention or any need to update our configuration, because the properties of the key never change (only the cryptographic material/secrets).

From https://docs.aws.amazon.com/kms/latest/developerguide/rotate-keys.html:

> When you enable automatic key rotation for a KMS key, AWS KMS generates new cryptographic material for the KMS key every year.
> AWS KMS saves all previous versions of the cryptographic material in perpetuity so you can decrypt any data encrypted with that KMS key.
> AWS KMS does not delete any rotated key material until you delete the KMS key. You can track the rotation of key material for your KMS keys in Amazon CloudWatch and AWS CloudTrail.

## Consequences

- We follow recommended practices, giving us confidence in the maintainability and validity of our key management strategy.
- We limit access to our KMS encryption keys to only our application and IRS cloud administrators.
- With automatic key rotation, rotations are transparent to our application, with no configuration updates needed.
- Importantly, if keys are compromised, automatic key rotation will not mitigate the issue.

## Resources

- [ADR for encrypting taxpayer data](adr_encrypting-taxpayer-data.md)
- https://docs.aws.amazon.com/kms/latest/developerguide/rotate-keys.html

docs/adr/2024-11-21_production-branching-strategy.md (new file, 178 lines)

# GitHub Branching Strategy for FS25 Production Development

Status: Accepted

Date: 2024-11-21

Decision: Option 1 - Set `main` as the "future features" branch for all TY25+ work, create a long-term `production` TY24 branch, and merge from `production` into `main`

## Context and Problem Statement

This ADR documents our short-term GitHub branching strategy for filing season 2025 (FS25), starting in January 2025.

We don't currently support multiple tax years simultaneously in Direct File; all active tax logic in the repo at any point in time is for a single tax year. In the pilot year, filing season 2024 (FS24), we only supported tax year 2023 (TY23), and in the second year starting in January 2025 (FS25), we plan to only support tax year 2024 (TY24). We also have an additional product goal of keeping Direct File open year-round, while only supporting one tax year at a time for now. Supporting multiple tax years concurrently in production is on the roadmap for TY26/FS27.

Our development goals in the second year of Direct File are to simultaneously support two modes of operating within our single GitHub repository:
1. Support the active tax logic in production for TY24, fix bugs, and potentially add small tax scope items that should be deployed to taxpayers in production during FS25
1. Start development on future scope beyond this year - Factgraph and foundational improvements and future TY25 tax logic that should not be deployed to taxpayers in production until after FS25

## Decision Drivers

Some factors considered in the decision:
- Ease of use for developers: how can we make it easy for developers to use the right branch by default? A majority of the team will be working on active production TY24 tax scope and a minority will be working on future scope
- Complexity for devops: understand the workload and changes required for devsecops with each strategy
- Error workflow: if a PR is merged to the wrong branch, how easy is the workflow for fixing the error?
- Minimize git history issues: consider alternatives to the cherry-picking approach we took last year in order to make production git history cleaner and more accurate. Avoid situations where cherry-picked commits do not include important prerequisite changes.
## Considered Options

### Option 1: Set `main` as "future features" branch for all TY25+ work and create a long-term `production` TY24 branch and merge from `production` into main

#### Branches

- `main`: The evergreen branch consisting of both future features and production TY24 code. This branch is *not* to be deployed to taxpayers in production during FS25
- `production`: The long-running branch consisting of only production TY24 code, to be deployed to taxpayers in production during FS25 on a weekly basis via release branches.
  - Pre-commit hooks and branch protection rules will also need to be configured on this new branch, enforcing code review, CI tests, scans, and more

#### Releases and Deployment

- Cut releases from `production`
- Deploy from `production`
- Reserve the TEST environment for regular deployments from `main` for testing future features
- Continue to use the PRE-PROD environment for `production` branch testing
- Continue these two branches throughout the year
- When it is time to cut over to TY25 (December 2025 or January 2026), fork a new `production` branch from `main` and continue the same processes for the next tax year (or move to Option 3 or another long-term branching strategy)

#### Developer workflows

The first step is to identify whether the work is destined for the current `production` branch, or whether it is work that is only for future tax years.

##### For production TY24

1. Check out and pull the `production` branch
   - `git checkout production`
   - `git pull`
1. Create/check out a new branch for feature/fix work
   - `git checkout -b branch-for-ticket`
1. Create a PR for changes into the `production` branch
1. Squash and merge the PR into the `production` branch
1. Create a second PR from the `production` branch into `main` and resolve conflicts if needed. We are unable to automate this step at this time, but would like to merge into `main` on every PR in order to make the conflict resolution easier.
   - Note: A regular merge (not "Squash and Merge") is recommended here

##### For future features

1. Check out the `main` branch
1. Make a PR against the `main` branch
1. Merge into the `main` branch

#### Error workflow

##### If a change intended for `production` is accidentally made on `main`

1. Revert the PR from `main`
1. Make a new PR against `production` and follow the developer workflow for production TY24 above

##### If a change intended for `main` is accidentally made on `production`

1. Notify the release manager in the event the change might be destined for deployment and coordinate as needed
1. Revert the PR from `production`
1. Make a new PR against `main` and follow the developer workflow for future features above
### Option 2: Set `main` as the production TY24 branch and create a long-term "future features" branch for all TY25+ work and merge it back into `main`

#### Branches

- `main`: The main production branch consisting of only production TY24 code, to be deployed to taxpayers in production during FS25 on a weekly basis via release branches
- `future`: The long-running branch consisting of only future features TY25+ code, *not* to be deployed to taxpayers in production during FS25

#### Releases and Deployment

- Cut releases from `main`
- Deploy from `main`
- Reserve a lower environment for regular deployments from `future` for testing future features
- Continue these two branches throughout the year
- When it is time to cut over to TY25 (December 2025 or January 2026), fork a new `main` branch from `future` and continue the same processes for the next tax year

#### Developer workflows

The first step is to identify whether the work is destined for the current production branch (`main`), or whether it is work that is only for future tax years.

##### For production TY24

1. Check out the `main` branch
1. Make a PR against the `main` branch
1. Merge into the `main` branch
1. Create a second PR from the `main` branch into `future` and resolve conflicts if needed. We are unable to automate this step at this time, but would like to merge into `future` on every PR in order to make the conflict resolution easier.
   - Note: A regular merge (not "Squash and Merge") is recommended here

##### For future features

1. Check out the `future` branch
1. Make a PR against the `future` branch
1. Merge into the `future` branch

#### Error workflow

##### If a production change is accidentally merged into `future`

1. Revert the PR from `future`
1. Make a new PR against `main` and follow the developer workflow for production TY24 above

##### If a future feature change is accidentally merged into `main`

1. Notify the release manager in the event the change might be destined for deployment and coordinate as needed
1. Revert the PR from `main`
1. Make a new PR against `future` and follow the developer workflow for future features above
### Option 3: Continue with one `main` branch and use feature flags to control production functionality

Two technical prerequisites are required for this option that do not exist today:
1. Ability to have multiple tax years exist in the codebase simultaneously, because the repository will not have a way of separating one tax year from another via branches. This functionality is currently planned for 2026, not 2025.
1. Robust feature flagging: because there will be only one `main` branch that is deployed to production, we need a feature flag system we are confident in, to ensure that future features that are not fully complete do not accidentally launch to taxpayers ahead of schedule.

#### Branches

- `main`: The evergreen branch consisting of both future features and production TY24 code. This branch is to be deployed to taxpayers in production during FS25 on a weekly basis via release branches

#### Releases and Deployment

- Cut releases from `main`
- Deploy from `main`
- When it is time to cut over to TY25 (December 2025 or January 2026), no actions are necessary: there is only one `main` branch, so no reconciliation or update of branches is needed

#### Developer workflows

##### For production TY24

1. Check out the `main` branch
1. Make a PR against the `main` branch
1. Merge into the `main` branch

##### For future features

1. Check out the `main` branch
1. Make a PR against the `main` branch with a feature flag that controls whether taxpayers will have access to the feature in production (for future features, this should likely be set to OFF)
1. Merge into the `main` branch

#### Error workflow

##### If a change is accidentally merged into `main`

1. Revert the PR from `main`
1. Make a new PR against `main` and follow the developer workflow for production TY24 above
## Decision Outcome

Chosen option: Option 1 - Set `main` as the "future features" branch for all TY25+ work, create a long-term `production` TY24 branch, and merge from `production` into `main`, in order to minimize the risk of future features merging into the `production` branch by mistake. This option will maintain stability in `production` for our taxpayers, but will result in slightly more friction in the TY24 developer workflow.

## Pros and Cons of the Options

### Option 1: Set `main` as "future features" branch for all TY25+ work and create a long-term `production` TY24 branch and merge from `production` into main

`+`
- Lower risk of future features slipping into production, because `main` as the "future features" branch is the default branch

`-`
- Changes to the devops workflow for releases and deployment are required because we will now be deploying from `production`
- Higher risk of TY24 changes being made on `main` instead of `production` because `main` is still the default. Developers working on TY24 will need to consciously switch over to the `production` branch, but mistakes on the `main` branch will not lead to taxpayer impact since `main` is not being deployed to production
- Breaks the principle that `main` is always deployable

### Option 2: Set `main` as the production TY24 branch and create a long-term "future features" branch for all TY25+ work and merge it back into `main` at the end

`+`
- `main` continues to be the source of truth for production code and is deployable to production
- No change to the devops workflow for releases and deployment; we continue to deploy from `main`

`-`
- Higher risk of future features accidentally slipping into `main` since it is still the default branch, which may lead to production issues for taxpayers
- End-of-season switchover is a little wonky, as `future` becomes `main` and a new `future` is created

### Option 3: Continue with one `main` branch and use feature flags to control production functionality

`+`
- No change to the developer workflow for either production TY24 changes or future feature changes
- No change to the devops workflow for releases and deployments

`-`
- Will require additional engineering lift to complete the technical prerequisites of supporting multiple tax years at once and building out the feature flag system, which it is unclear we have time for before January 2025

## Background information

- Our team is currently using [GitHub flow](https://docs.github.com/en/get-started/using-github/github-flow)
- Some additional git workflows considered for inspiration were [Gitflow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow) and [GitLab flow](https://about.gitlab.com/topics/version-control/what-is-gitlab-flow/), neither of which perfectly met our needs in a simple way

docs/adr/adr-css-preprocessor.md (new file, 44 lines)

# ADR: CSS Preprocessor

Written: 20Jun2023

## Background

CSS has some major problems. Namely, it:
1. lacks variable names and the ability to import variables from other libraries.
1. lacks nesting and other hierarchical structures that are common in components.
1. is global (cascading!), leading to naming conflicts, difficult dead-code detection, and difficult maintainability in large code bases.
1. is constantly updated, and can have different implementations and minor differences between browsers.

To avoid these issues, while adding additional features, most of the web development community uses one or more forms of CSS preprocessor, preprocessing a superset of CSS into the CSS that will eventually reach the users' browsers.

### Required Features

1. Must be able to implement and rely on the US Web Design System (USWDS)
   1. Specifically, we need to interact with and rely on https://github.com/trussworks/react-uswds and https://github.com/uswds/uswds. Both of these systems use SASS and export SASS variables.
1. Must be able to scope CSS, eliminating global classes
1. Must help, not hinder, building an accessible product
1. Must build output that runs without issue on popular phones and computers
1. Must support mobile development and responsive design

### Evaluation Criteria

1. Interoperability with existing USWDS libraries
1. Reputation within the frontend web development community
1. Output to responsive, mobile-friendly, and accessible design

## OSS Landscape

There are a few popular CSS preprocessors:
1. [SASS](https://sass-lang.com/), per their own marketing copy, defines itself as "the most mature, stable, and powerful professional grade CSS extension language in the world." Sass has ~10m weekly downloads on NPM and is increasing in number of downloads.
1. [LESS](https://lesscss.org/) is the main competitor to SASS, and contains many of the same features. Less has ~4m weekly downloads on NPM and is flat in number of downloads.
1. [PostCSS](https://postcss.org/) converts modern CSS into something most browsers can understand, putting polyfills in place. PostCSS is not a separate language -- it's a compile step, like Babel, for greater compatibility. Stylelint and other tools are built on PostCSS.
1. [CSS Modules](https://github.com/css-modules/css-modules) provide local scoping for CSS. Styles are defined in a normal css/less/sass file, then imported into the React components that use those classes.
1. [Tailwind](https://tailwindcss.com/) is notable for being slightly different from the other popular options: it is a CSS framework -- rather than a preprocessor -- that encourages stylistic, rather than semantic, classnames directly in markup. It's gaining popularity rapidly (4.7m downloads/wk, up from 3m downloads/wk a year ago). However, it would be hard to integrate with USWDS.
1. [Stylelint](https://stylelint.io/) is a CSS linter used to prevent bugs and increase the maintainability of CSS.

## Decision

We should run the following CSS preprocessors:
1. Our CSS language should be SASS, given its popularity and interoperability with USWDS. Most critically, we can import variable names from USWDS.
1. We should additionally use SASS modules to scope our CSS to their components, avoiding global cascades.
1. We should use stylelint with its recommended config. We should also use the [a11y](https://www.npmjs.com/package/@ronilaukkarinen/stylelint-a11y) plugin experimentally to see if it helps us with accessibility (though noting that it seems not well supported and we should be willing to drop it).
1. Following our SASS compilation step, we should run PostCSS to get down to the list of browsers that we support via [browserslist](https://github.com/browserslist/browserslist#readme).

Unsurprisingly, when developing for these criteria (and with a sigh of relief that USWDS uses SASS), this is the same CSS stack used by [Create React App](https://create-react-app.dev/docs/adding-a-css-modules-stylesheet).
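As a small illustration of the chosen combination (SASS compiled per component via CSS Modules), a hypothetical component might look like the following; the component and class names are made up, and the `.module.scss` file can `@use` USWDS tokens:

```tsx
// Checklist.tsx — hypothetical component using a scoped SASS module.
// Checklist.module.scss defines `.container` and `.completed`; the build
// rewrites these to locally scoped class names, so there is no global cascade.
import React from "react";
import styles from "./Checklist.module.scss";

export function Checklist({ done }: { done: boolean }) {
  return <div className={done ? styles.completed : styles.container}>Checklist</div>;
}
```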

docs/adr/adr-email-allow-list.md (new file, 54 lines)

# Email Allow List for Phase A of Pilot Rollout

Date: 12/28/2023

## Status

Complete

## Context

For the Phase A launch, which will be invite-only and internal (around 1/29 through early February), we need to restrict access to Direct File using an email allow list. We will be provided a CSV of emails that will need to be checked by Direct File before allowing users to start their tax return. Users will need to authenticate via SADI/ID.me, and once they arrive at Direct File, we need to check their SADI email address against our allow list. If the email is not on the allow list, the user should see an error message. The Direct File team does not have access to change anything in PROD, so we will use the following process to coordinate with TKTK, the IRS team who manages the PROD environment.

### Requirements

- Handle ~100s of emails on an allow list
- Accounts on the email allow list get access to Direct File in Phase A
- Accounts not on the email allow list see an error message when they try to go to Direct File
- The email allow list needs to be able to be updated WITHOUT a PROD code deployment

### Background

The current plan for identifying Direct File pilot users for Phase A:
1. An email will be sent to IRS employees asking for interest in participating in the Direct File pilot
1. IRS employees will tell us which email to use for the allow list (most likely via email)
   1. We recommend using a personal email, but we can't be sure someone hasn't already used their IRS email to sign up for IRS Online Account, and we don't want to ban them from using DF
1. We'll collect those emails and turn them into a .csv for the allow list
1. Employees will be able to use Direct File on campus and during work hours, using IRS machines (some of these employees may not have home computers or modern cell phones)

## Decision

### Email Allow List

- There will be fewer than 1,000 emails on the email allow list
- The email allow list will be saved in the `artifact-storage` S3 bucket already used by the DF backend for storing PDF files
- The format of the email allow list when loaded into S3 will be a CSV of normalized email addresses, HMAC-hashed with a DF seed, in order to prevent plaintext emails from being read (a sketch of the hashing step follows this list)
- The DF seed will be used both
  1. for hashing the email addresses in the CSV file, and
  2. by the DF application to check whether a logged-in user is on the email allow list
- Emails will be normalized by performing the following transformations in order:
  1. Convert to lower case
  1. Trim any leading or trailing spaces
- We will update the `users` DB to store a new column representing whether or not the user is allowed to access the application. We will also use this column to determine whether a user is allowed in Phase C; a separate ADR will follow
- The alternative to having a DB record for this was to call the SADI PII service every time, which risks creating downstream impacts for SADI
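A minimal sketch of that hashing step, assuming HMAC-SHA256 with the DF seed as the key (the specific HMAC algorithm and key handling are implementation details not fixed by this ADR):

```typescript
import { createHmac } from "crypto";

// Normalize then hash an email the same way in both places: when building the
// allow-list CSV and when checking a logged-in user's SADI email.
export function hashEmail(seed: string, email: string): string {
  const normalized = email.toLowerCase().trim(); // 1) lower case, 2) trim spaces
  return createHmac("sha256", seed).update(normalized).digest("hex");
}

// Membership check against the hashed allow list held in memory.
export function isAllowed(allowList: Set<string>, seed: string, email: string): boolean {
  return allowList.has(hashEmail(seed, email));
}
```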
### Direct File Implementation

- Set up an environment variable `enforce-email-allow-list` to control whether we enforce the email allow list in each environment.
- The DF backend app checks the environment variable for the email allow list. If it is ON, continue:
- On a regular time interval, say every `15 minutes`, the DF backend polls S3 for an updated email allow list file (see the sketch after this list):
  - If it exists, load the emails into memory
  - If the file does not exist, deny all access
- The email allow list contains HMAC-hashed email addresses, so the DF backend will need to take the user's SADI email and hash it using `HMAC (seed + lowercase(email))` to compare it to values on the email allow list
- For every request to the DF backend, check whether the user is allowed:
  - Check whether the user already exists in the DB
    - If yes: check the access-granted column to see if the user has been allowed
      - If yes: allow access to DF as normal
      - If no: the backend should respond with a 403 error code and a custom message, which the frontend will turn into a user-friendly message
    - If no: check the hashed SADI email against the allow list
      - If yes: mark the access-granted column as true for the user
      - If no: mark the access-granted column as false. If the user has been previously allowed, we do not change the access from true to false
- A code deployment is needed in order to turn off the email allow list and move to the next phase (controlled availability)
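A rough sketch of the polling behavior described above (the real implementation lives in the backend; the object key and CSV layout here are assumptions):

```typescript
import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({});
let allowList = new Set<string>(); // HMAC-hashed emails currently loaded
let allowListAvailable = false;    // if the file is missing, deny all access

export async function refreshAllowList(): Promise<void> {
  try {
    const obj = await s3.send(
      new GetObjectCommand({ Bucket: "artifact-storage", Key: "email-allow-list.csv" })
    );
    const body = await obj.Body!.transformToString();
    allowList = new Set(body.split("\n").map((line) => line.trim()).filter(Boolean));
    allowListAvailable = true;
  } catch {
    allowListAvailable = false; // treat a missing/unreadable file as "deny all"
  }
}

// Poll on a regular interval, e.g. every 15 minutes.
setInterval(refreshAllowList, 15 * 60 * 1000);
```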

docs/adr/adr-fact-dictionary-tests.md (new file, 98 lines)

---
# Configuration for the Jekyll template "Just the Docs"
parent: Decisions
nav_order: 100
title: "Fact-dictionary Testing"

# These are optional elements. Feel free to remove any of them.
status: "Proposed"
date: "2023-07-21"
---
<!-- we need to disable MD025, because we use a different heading ("ADR Template") on the homepage (see above) than is foreseen in the template -->
<!-- markdownlint-disable-next-line MD025 -->
# Fact-dictionary testing

## Context and problem statement

The core logic for how we calculate taxes is expressed through the Fact Dictionary, a configuration of the Fact Graph akin to a database schema (including something akin to views, via our derived facts). Currently, however, we have no direct way to test these configurations; we can test the client and then check values, but the dictionary is used in more than just the client, and tests for it don't naturally belong there.

## Desirable solution properties

We'd like the solution to this problem to

1. Create test artifacts that can be verified by non-programmer tax experts
2. Allow for quick feedback cycles for developers editing the fact dictionary
3. Alert developers to regressions or new bugs while editing the fact dictionary
4. Identify calculation differences between the frontend and backend

## Considered options

1. Testing via the client directly, by writing vitest tests
2. Testing via calls to the Spring API
3. Testing via spreadsheets, using the client test runner as a harness
4. Testing via spreadsheets, using a new Scala app that runs tests against the JVM and Node

## Decision Outcome

Chosen option: test via spreadsheets that are parsed by the client, using the client test runner as a harness. The spreadsheets will test common and edge tax scenarios, to identify possible errors in the fact dictionary.

In some detail, we will
- Create a series of workbooks that have a sheet of "given facts" full of paths to writable facts and their values, and a sheet of "expected facts" of derived facts and their expected values.
- Create a test harness (using the JS libraries we use for the client) to watch and execute tests against these spreadsheets, effectively setting everything in the given sheet on a factgraph and then asserting that each path in the expected sheet equals its value (a sketch of this harness follows below).
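The harness could take roughly the following shape; the workbook loader and factgraph helpers here are hypothetical stand-ins, not the client's real APIs:

```typescript
import { describe, it, expect } from "vitest";
// Hypothetical helpers: the real client supplies its own factgraph setup and
// a way to read the scenario workbooks.
import { loadScenarioWorkbooks } from "./spreadsheetLoader";
import { createFactGraph } from "./factGraphSetup";

// Each workbook has a "given" sheet (writable fact paths and values) and an
// "expected" sheet (derived fact paths and the values they should produce).
for (const workbook of loadScenarioWorkbooks("./scenarios")) {
  describe(`scenario: ${workbook.name}`, () => {
    it("derives the expected facts from the given facts", () => {
      const graph = createFactGraph();
      for (const { path, value } of workbook.given) {
        graph.set(path, value); // write every given fact
      }
      for (const { path, value } of workbook.expected) {
        expect(graph.get(path)).toEqual(value); // assert each derived fact
      }
    });
  });
}
```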
### Consequences

1. Good: If we get the spreadsheet format to be readable enough, we should be able to have the spreadsheets reviewed by someone at the IRS to check for correctness
2. Good: The client APIs are pretty quick to work with, and the vitest runner is familiar and fun
3. Good: Most of the fact dictionary edits happen while building out client code, so the feedback happens where we want it, during the normal development workflow
4. Good: Assuming we get a format we like, it wouldn't be too outlandish to run the same tests against the JVM by rewriting the test harness --- creating the test cases is the really hard part
5. Less good: We're going to be abusing vitest
6. Less good: We won't identify differences between the backend and frontend
7. Neutral, but I don't like it: coming up with examples is both subjective and difficult, since you have to do taxes correctly

### Discussion

During the discussion we decided, somewhat in despair, that looking for differences between the frontend and backend was distinctly secondary and could come later.

Chris made two helpful suggestions:

- We can mine the VITA Volunteer Assistor's test ([pub 6744](https://www.irs.gov/pub/irs-pdf/f6744.pdf)) for scenarios to test. Sadly, there is no answer key included.
- We will likely want to be able to base scenarios on one another (which I called spreadsheet inheritance to make everyone sad), so that we can test variations

## Pros and Cons of other options

### Testing via the client directly, by writing vitest tests

#### Pros

- Really straightforward, since we just write some tests
- Gets us tests

#### Cons

- Not easily reviewable by non-programmers
- Checking the tests on the JVM requires rewriting all of them

### Testing via calls to the Spring API

#### Pros

- Probably good tooling exists --- it's Spring
- Checks the test cases against the JVM

#### Cons

- Requires new Java APIs just for this
- Requires running the backend just to run the tests, which isn't where most fact dictionary changes happen

### Testing via spreadsheets, using a new Scala app that runs tests against the JVM and Node

#### Pros

- All the pros of the chosen approach, plus gives us Node and JVM testing

#### Cons

- Just literally harder --- we have a lot of what we need in the client already, and we'd be spinning up another Scala app just for this
- Outside of the usual workflow, since devs would need to run a separate Scala build tool just for testing fact dictionary changes

docs/adr/adr-fact-modules.md (new file, 126 lines)

# ADR: Fact Dictionary Modules (and how to test them)

DATE: 11/27/2023

## TL;DR

Creating modules for facts should let us move from an untestable amount of complexity and interactions between facts to well-defined boundaries that developers can reason about and test.

## Background

### The hodgepodge of facts and completeness

At the moment we have 451 derived facts and 261 writable facts in the fact dictionary. Any derived fact can depend on any other fact (written or derived, from any section of the app), and any fact can be in any state of completeness, which can bubble through derived facts, leading to unexpected outcomes. In particular, bugs or errors written in the "You and Your Family" section might not manifest themselves until the final amount screen. These bugs are common and create a game of whack-a-mole, and we need to replace whack-a-mole with a stronger system that prevents moles in the first place.

Additionally, the fact dictionary is hard to reason about in its current form due to these sometimes-odd relationships. A developer modifying social security may need to reason about a fact from the filer section that may or may not be intended for usage outside of that section.

In a world where any node can depend on any other node, we have to test everything, everywhere, all of the time, instead of being able to separate our tests into logical chunks and the boundaries between those chunks.

The current state of our fact dictionary looks like: [diagram of the current fact dictionary's any-node-to-any-node dependency graph omitted]

### Example bugs caused by untested interactions between sections

These are a few examples of where bugs in some sections have manifested in strange ways elsewhere in the app:

1. When Married Filing Separately (MFS), we never collected `secondaryFiler/canBeClaimed` (because it should not matter), but this led to a person getting stuck at the deductions section since `/agi` would be incomplete. We should have made sure that `/filersCouldntBeDependents` was always complete by the end of the spouse section.
2. When filing as either married status, and the filers did not live apart for at least six months, our social security section would break because we did not know if the filers lived apart all year.
3. Any incompleteness bug in the app will leave `/dueRefund` incomplete (based on the tax calculation being incomplete), and the user will break on the 'tax-amount-number' screen.

### Testing for completeness

We have [completeness tests](../direct-file/df-client/df-client-app/src/test/functionalFlowTests/checkFlowSetsFacts.ts) that can test that a section of the flow will leave certain facts as complete. However, it's unclear which facts we should test for completeness at which points, because there are no well-defined boundaries between sections. For instance, in example (1) above, a test that `/filersCouldntBeDependents` was complete by the end of the spouse section would have prevented the bug, but nobody knew to write that test.

Beyond that, even if we get this right at some point, there's no guarantee it will remain correct. If a maybe-complete fact `/maybeCompleteFact` gets a default value in a derived fact `/alwaysCompleteFact`, there's no guarantee that a downstream fact will correctly use `/alwaysCompleteFact` instead of accidentally using the maybe-complete version.

## Proposed solution -- setting healthy boundaries via modules

### Goals

- When working on a section of the app (e.g. a credit like the EITC, or an income type like an SSA-1099), a developer should only need to reason about the facts in that section, and other well-tested, known-to-be-complete facts from other sections.
- Facts that are relied on across sections should be a defined set with high test coverage, and each such fact should have a point in the flow where it is known to be complete.
- We should be able to think separately about testing facts within a section, and facts that cross boundaries between sections, similar to how we can think about testing implementations and API boundaries.
- Find a way or ways to reduce the risk of unintended consequences when adding, changing, or deleting facts by limiting the possible dependencies of a fact

### Proposal

1. (must) We introduce two new concepts to our facts:
   1. A module defines a logical section of the fact dictionary, such as the EITC, the educator deduction, or social security income. It *may* align with a section of the flow, but that is not required, and the fact dictionary continues to not know about the existence of the flow. All facts should exist in a module. During a transitional period, `ty2022` will be a module.
   2. Facts default to being private to their own module. They can be exported from their module for two purposes -- for `downstreamFacts` and for `mef`. Our build will fail if a fact in one module depends on a private fact from a different module.
2. We break `ty2023.xml` into multiple files -- one file per module. The module name will be equivalent to the filename -- e.g. `eitc.xml` will produce the module `eitc`. This helps developers understand the scope of their current changes, and is consistent with modularity patterns in programming languages. It additionally allows test files to align their naming with the section of the fact dictionary that they test, and should be helpful towards tracking per-module test coverage.
3. We test module dependencies at build time, but we will strip away the module information for runtime -- fact dictionary paths are complicated, and December 2023 is not a good time to mess with them. The runtimes on the frontend and backend will continue to run with a full list of facts, like they currently do.
4. (should) We continue writing our fact dictionary tests as vitest tests, since we have many existing tests in that framework that we can re-use, and we can easily run these tests. In the future, we may change these to a more language-agnostic format such as a spreadsheet.

#### Example

```xml
<!-- spouse.xml -->
<Fact path="/writableLivedApartAllYear">
  <Name>Writable lived apart from spouse the whole year</Name>
  <Description>
    Whether the taxpayer lived apart from their spouse for the whole year.
    Use `/livedApartAllYear` for downstream calculations
  </Description>

  <Writable>
    <Boolean />
  </Writable>
</Fact>


<Fact path="/livedApartAllYear">
  <Name>Lived apart from spouse the whole year</Name>
  <Description>
    Whether the taxpayer lived apart from their spouse for the whole year.
    Takes into account that we only ask this question if the TP and SP
    lived apart the last six months.
  </Description>
  <Export downstreamFacts="true" />

  <Derived>
    <All>
      <Dependency path="/writableLivedApartAllYear" />
      <Dependency path="/livedApartLastSixMonths" />
    </All>
  </Derived>
</Fact>

...

<!-- eitc.xml -->
<Fact path="/eligibleForEitc">
  ...
  <Dependency module="spouse" path="/livedApartAllYear" />
  ...
</Fact>
```

And with that, our fact dictionary should go from the complicated "any node to any node" setup to instead look more like a graph where the exported facts are the edges between modules, and we can write tests to check internal to a module, or using the boundary between modules.

Instead of having to test something as complicated as the above fact graph, we'll instead have something that looks closer to: [diagram of the modularized fact dictionary, with exported facts as the only edges between modules, omitted]

Whereas testing the above any-node-to-any-node setup seems impossible, this looks like discrete chunks we can think about.

### Build Time Checks

We should build the following tests into our pre-merge CI pipeline (a sketch of the first check follows this list):

- (must) A derived fact in a module only depends on facts in its own module or facts exported from other modules. For now I propose this being static analysis, but in the future, the fact graph itself could know about modules and do runtime checking.
- (should) A fact that is exported from a module is tested to be complete by some point in the flow, before it starts getting used by other modules (e.g. after you complete the spouse section, `/filersCouldntBeDependents` should always be complete). We can use our existing completeness tests for this, but modify them to test "all exported facts from a module" rather than a manually created and maybe-out-of-sync list defined in a TypeScript file.
- (should) Exported facts should be tested for correctness within their own section, and then can be relied on to be correct outside of their section (e.g. we've tested the various combos of `/filersCouldntBeDependents` for each filing status and writable fact scenario. There's no need for the deductions section to check every combo of the tree of facts upstream of `/filersCouldntBeDependents`).
- (must) The MeF integration should only use variables that are exported from a module for MeF. We may not immediately use this functionality, as it requires additional work (MeF sometimes intentionally depends on incomplete variables)
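A simplified sketch of what the first (must) check could look like as a CI script; it uses regular expressions instead of a real XML parser and assumes one module per `.xml` file in a single directory:

```typescript
import { readFileSync, readdirSync } from "fs";
import { basename, join } from "path";

// Per module file, collect which fact paths are exported for downstream use.
function collectExports(dir: string): Map<string, Set<string>> {
  const exportsByModule = new Map<string, Set<string>>();
  for (const file of readdirSync(dir).filter((f) => f.endsWith(".xml"))) {
    const xml = readFileSync(join(dir, file), "utf8");
    const exported = new Set<string>();
    // Rough heuristic: a fact counts as exported if its block contains
    // an Export element with downstreamFacts="true".
    for (const fact of xml.matchAll(/<Fact path="([^"]+)">([\s\S]*?)<\/Fact>/g)) {
      if (/<Export[^>]*downstreamFacts="true"/.test(fact[2])) exported.add(fact[1]);
    }
    exportsByModule.set(basename(file, ".xml"), exported);
  }
  return exportsByModule;
}

// Fail the build if any cross-module Dependency points at a non-exported fact.
export function checkCrossModuleDependencies(dir: string): string[] {
  const exportsByModule = collectExports(dir);
  const errors: string[] = [];
  for (const file of readdirSync(dir).filter((f) => f.endsWith(".xml"))) {
    const xml = readFileSync(join(dir, file), "utf8");
    for (const dep of xml.matchAll(/<Dependency module="([^"]+)" path="([^"]+)"/g)) {
      const [, module, path] = dep;
      if (!exportsByModule.get(module)?.has(path)) {
        errors.push(`${file}: depends on private fact ${path} from module ${module}`);
      }
    }
  }
  return errors;
}
```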
### Future work

After this we can investigate any of the following:
1. Using the new, better-defined inputs into a section to inform test coverage of that section (e.g. better understanding that TIN/SSN validity could affect W2 completeness could have prevented a bug TKTK)
2. Moving fact dictionary tests from TypeScript to a language-agnostic format
3. Making the fact graph itself aware of modules and adding runtime checks
4. Measuring test coverage of each derived fact (I don't know how to check that we've actually hit every statement in a switch, etc.)
5. Perf improvements to the NP-hard flow tests

But most of the above will be a lot easier if we understand the boundaries we can test, instead of trying to test everything, everywhere, all of the time.

## Changelog

1. (12/4/23) Modified the term "namespace" to "module"
2. (12/4/23) Specified file names as module names
3. (12/4/23) Defaulted facts to private to their module
4. (12/4/23) Specified how modules will be imported by Dependency blocks

(new file, 48 lines)

# ADR: Frontend Architectural Constraints for the Spring 2023 Client

Date: 10 May 2023

## Background

In April of 2023, the Direct File team decided to prototype a new frontend (different from the one developed for usability tests earlier in the year) with the following goals:

1. The main source of state/truth is the Fact Graph, as provided by the transpiled Scala code
2. The screens for the application are driven by a static configuration that maps nodes in the Fact Graph to fields on the screen
   1. Screens can be conditionally displayed, and the condition is calculated on the basis of data in the Fact Graph
   2. Groups of screens, called collections, can be repeated - most commonly when a tax return has more than one W2 or Filer
3. All tax calculation is handled by the Fact Graph
4. The structure remains a SPA, to take advantage of edge processing on end users' devices when doing Fact Graph calculations
5. We can support multiple languages, and translation strings can include user-entered data
6. The application will meet the requirements of WCAG 2.0 and relevant 508 Standards

## Architectural properties

### Major pieces

Work to produce a client with the above properties has been prototyped in df-client as a React app. This app contains some early-stage design decisions:
- a **flow**, defined in a tsx (soon to be xml) file, specifies the order of questions in the interview and the fields to be displayed on each screen
- the **fact dictionary**, serialized as JSON by the backend, defines the shape of the Fact Graph, how facts are calculated/derived, and what the constraints on given facts are
- **translation strings**, using i18next/i18next-react, are modified to perform string interpolation from the Fact Graph, as well as to present predetermined tax information in user interface text

The basic mechanics of the client's operation are as follows:

1. On startup, the client initializes a mostly-empty fact graph using the JSON-serialized fact dictionary. This is exported as a global singleton object and is **the** mechanism for conveying tax-data-related state across screens/parts of the application. State that is unrelated to the user's tax data is stored in session storage and will be discussed in later ADRs.
2. A `BaseScreen` component reads the **flow** configuration to configure screens. Each screen contains either text (called `InfoDisplay`) or some number of fields (generically called `Facts`) gathering information from the user for the fact graph.
   1. Each field is responsible for writing its own data into the fact graph, once those data are valid. Additionally, fields read their initial data from the fact graph, which allows users to go back to previous screens and edit.
   2. The screen saves the user's data when the user taps "Save" and transmits the writable facts back to the server.
3. The URLs within the interview portion of the application are formatted using a nested hierarchy, where each screen's URL reflects the current position in the flow.
4. A `Checklist` component reads the **flow** and builds itself based on the category/subcategory/screen/fact configurations therein. The checklist computes the user's progress through the application by looking up which facts are complete and thereby which screens have all of their facts complete. These calculations are made based on the state of the local fact graph.

### Constraints

As we've built out the application, we've assumed the following constraints --- while none of these are set in stone, they are set in code that would require relatively deep changes to undo (a sketch of the flow-conditional constraint follows this list):

1. Several components use the **flow** configuration independently, and we treat it as, effectively, a global constant. As such, the **flow** configuration itself should be static. It allows for conditionals, so the user's flow can respond to change, but all possible paths should be in the flow at startup.
   1. Conditionals in the flow expect a simple boolean value from the Fact Graph --- any complex logic for flow conditionals should be computed in the Fact Graph and exposed as a boolean derived fact
2. The application's overall state is opaque to React, which makes having multiple independent parts of the application concurrently rendered and synchronized difficult. For example, if we wanted to keep the checklist visible as the user entered facts, and have the checklist change its state automatically as the user completed various facts, this would require some creative engineering. As it is currently, this is not an issue --- React re-renders whole screens as the user works through the application, and it pulls the facts it needs for a given screen from the fact graph on render. From the user's perspective, the checklist _is_ always up-to-date, because every time a user looks at the checklist, it will re-render and pull the latest overall state from the fact graph to calculate itself. The same is true of each individual screen in the interview --- they all replace one another, which necessitates re-rendering, and therefore screens are always built from the current fact graph state. We only run into problems if we want multiple screens that are currently rendered alone to be rendered together on the same screen and react to one another.
3. While we can reference facts in translation strings (via interpolation), the template language in i18next is VERY simple. I don't have a good solution here, other than to read the docs and expect we'll need to code some solutions around it.
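As an illustration of the first constraint, a static flow entry might look something like the following; the shape and names are invented for this sketch, not the actual flow schema:

```typescript
// A hypothetical flow fragment: screens are listed statically, and visibility
// is driven by a single boolean derived fact rather than inline logic.
export const spouseSubcategory = {
  route: "/flow/you-and-your-family/spouse",
  screens: [
    {
      route: "spouse-lived-apart",
      // Any complex logic (filing status, months lived apart, etc.) is
      // computed in the Fact Graph and exposed as one boolean derived fact.
      condition: "/shouldAskLivedApartAllYear",
      facts: ["/writableLivedApartAllYear"],
    },
  ],
};
```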
## Status

Documentation

docs/adr/adr-frontend-client.md (new file, 154 lines)

# ADR: Front-end client
|
||||
Updated: 3Jan2023
|
||||
|
||||
## Background
|
||||
> There are two general approaches to building web applications today:
|
||||
> traditional web applications that perform most of the application logic on the server to produce multiple web pages (MPAs),
|
||||
> and single-page applications (SPAs) that perform most of the user interface logic in a web browser,
|
||||
> communicating with the web server primarily using web APIs (Application Programming Interfaces).
|
||||
|
||||
There's a 3rd option of having some lightweight javascript interation on just a few pages, but these fall into the MPAs.
|
||||
An example is a mostly static sites with self-contained form pages that need client side validation of input before submitting.
|
||||
|
||||
This document assumes the 2024 scope ONLY.
|
||||
|
||||
### REQUIRED features
|
||||
In no particular order: (todo: reorder based on priority
|
||||
1. Produces static website that interacts via Secure API
|
||||
1. Support for the top 98% of browsers at time of launch
|
||||
1. 508 Accessiblity compilant / USWDS style library
|
||||
1. Multi-langage support (English and Spanish) aka "i18n" (internationaliazation => i...*(18 chars)*...n)
|
||||
1. Mobile friendly (format and load speed)
|
||||
1. Secure/Trusted client libraries
|
||||
1. Deals with PII and so privacy/security is important
|
||||
1. Safe to use from shared computer (such as a public library)
|
||||
1. Security best practices. e.g. CORS and CSP (Content Security Policy)
|
||||
1. Support for multiple pages and complex links between them.
|
||||
1. Agency familiarity with libraries
|
||||
|
||||
### NOT REQUIRED features
|
||||
1. Offline/sync support
|
||||
1. Not building a Full CMS (Content management system) like WordPress/Drupal
|
||||
1. Not supporting older browsers or custom mobile browsers (feature-poor, less secure)
|
||||
|
||||
|
||||
## Client code library
|
||||
There are many choices for client-side code libraries to help reduce custom code.
|
||||
|
||||
Using one of these libraries increases security since they have been widely vetted and follow many best-pratices. It also reduces the cost of development/maintenance since the more popular a client library is, the more coding talent and tooling is available. The most popular libraries also allow for a common, shared knowledge for the correct way to solve a problem.
|
||||
|
||||
One issue with larger client-side code libraries is that they have deeper "supply chains" of software. This is where a top-level library uses features of some other library, which in turn, includes even more 3rd party libraries. This nested approach continues until there are 50k+ files needed to build even the most simple application. This dependency amplification tends to make common functions more shared, but can also greatly increase complexity when an update is needed in a deeply nested package libraries, especially when there are multiple revisions depenencies (higher level libraries explicietly require different versions of the same package).
|
||||
|
||||
Code scanners can find and automatically patch these nested packaged library updates (example: github's dependabot) to reduce constant workload overhead of updating. Fully automated unit tests are used to verify updating a dependency didn't break anything; therefore, creating rich unit tests becomes a crutial aspect of modern projects.
|
||||
|
||||
### Evaluation criteria
|
||||
1. **Supports required features from above**
|
||||
1. popularity
|
||||
1. static build output (easier ATO, "serverless")
|
||||
1. support for unit tests
|
||||
|
||||
There's an added wrinkle that many libraries (React, Angular, etc) have a "add-on" approach and use _other_ libraries to get features. NextJs and Gastby are examples for the React ecosystems.
|
||||
|
||||
### Combos
|
||||
|
||||
| Libary + AddOn | Pros | Cons |
|
||||
----------------------|-----------------------------|-----------------------|
|
||||
| React v18 + i18Next | Minimal complexity - full featured | None? |
|
||||
| React + Gastby | Good at i18l. Great plugins | Uses GraphQL not REST. Larger output |
|
||||
| React + NextJS | More like a CMS | Mostly SSR, recent static site support. Larger output |
|
||||
| React + Progressive Web App | Great for mobile/offline | More complex deploy |
|
||||
| Angular v13 | Popular | Not as efficent / popular as React |
|
||||
| VueJS | Smaller than React. | Less popular |
|
||||
| Svelte | Minimal size. Few plugins. | Not widely used. |
|
||||
| JQuery | Minimal size. | Requires lots of custom code to do things |
|
||||
| Vanilla JS | Minimal size. | Requires lots of custom code to do things |
|
||||
|
||||
|
||||
## Tooling
|
||||
1. Code quality scanners: for quality, security issues
|
||||
1. Dependency scanners: for 3rd party libraries and their nested includes; alerts on outdated dependencies
|
||||
1. Unit testing: automated tests to verify deep core functionality
|
||||
1. Integration testing: aka end-to-end testing
|
||||
|
||||
### Microsoft's Github toolchain
|
||||
Correct tooling is critical. GitHub offers a complete CI/CD ecosystem for automating code quality scans, secrets scans, unit tests, library updates, deployment, and 508 compliance scans. It's widely used and receives constant updates and improvements.
|
||||
|
||||
It also has a ticketing/tracking system for code bugs/improvements that integrates with code changes. It can be configured to only merge code once all automated checks have passed and another team member has reviewed and approved the changes.
|
||||
|
||||
Many of these features are free for open source projects.
|
||||
|
||||
If we do NOT use GitHub's CI/CD toolchain, then we'll need to find other products to fill these needs and staff expertise for those alternate tools.
|
||||
|
||||
This tightly couples the project to GitHub, but it offers so many benefits for starting projects that it's probably the correct path.
|
||||
|
||||
#### Non-public vs. open source projects
|
||||
Many quality/security scanning tools need access to the source to work. This is trivial for open source projects but requires more work for closed source. These tools may have licensing costs for non-open source projects or raise concerns about exposing sensitive code to 3rd parties.
|
||||
|
||||
The FE project will _**heavily**_ leverage existing open source projects: React, typescript, USWDS, etc.
|
||||
|
||||
##### Opinion
|
||||
The Front End code should be open sourced.
|
||||
|
||||
Rationale:
|
||||
1. All FE "source" is "client-side script" (javascript). This means the transpiled source script is downloaded into the public browsers to run. It is **_not_** possible to "hide" this code since it runs in the end user's browser.
|
||||
1. Taxpayers (aka the American Public) funded the development of this source; therefore, the result of that work should be public unless there's some reason for it not to be.
|
||||
1. Open source can be more widely vetted for bugs/issues.
|
||||
1. If designed accordingly, the Front End could be just a presentation layer that interacts with the user and that uses a verified server-side API to do proprietary work.
|
||||
|
||||
### Libraries
|
||||
Using libraries for common functionality allows for quicker development AND avoids common security mistakes. Popular open source libraries are widely vetted and patched frequently.
|
||||
|
||||
## Decision
|
||||
The early Usability Study portion of the initial product cycle (meaning as we're creating the first iteration of the product) is a good time to try out different libraries and drive a decision.
|
||||
|
||||
[Updated: March 8th, 2023]
|
||||
For the first batch, the technologies tried out were (a brief usage sketch follows the list):
|
||||
1. `React v18 with Hooks`
|
||||
1. `typescript` everywhere
|
||||
1. `i18Next` for multilingual / localization
|
||||
1. `Redux` for state machine needs
|
||||
1. Trussworks's `react-uswds` for USWDS+React
|
||||
1. `jest` for unit tests
|
||||
1. `cypress` for functionality testing
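A minimal sketch of how the first few of these pieces compose (the component and translation key are hypothetical, not actual Direct File code):

```tsx
// Hypothetical component: a USWDS-styled button whose label is managed by i18next.
import { useTranslation } from 'react-i18next';
import { Button } from '@trussworks/react-uswds';

export function SaveAndContinueButton({ onSave }: { onSave: () => void }) {
  const { t } = useTranslation();
  return (
    <Button type='button' onClick={onSave}>
      {t('button.saveAndContinue')}
    </Button>
  );
}
```

Redux, jest, and cypress sit alongside this: Redux for state, jest for unit testing components like the one above, and cypress for end-to-end flows.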
|
||||
|
||||
## Rationale
|
||||
**React v18**
|
||||
|
||||
A standard reactive UX library that's used widely by various Government agencies. It's well documented, and it's easy to hire software developer talent with prior experience.
|
||||
|
||||
**typescript**
|
||||
|
||||
A modern standard that "levels up" JavaScript from a loosely typed scripting language to a strictly typed compiled (transpiled) language. It helps catch bugs at compile time versus runtime, enforces that types are correct, etc.
|
||||
|
||||
**i18Next**
|
||||
|
||||
One possible multilanguage (translation/internationalization) library to support translation into Spanish.
|
||||
|
||||
**Redux**
|
||||
|
||||
...
|
||||
|
||||
**react-uswds**
|
||||
|
||||
...
|
||||
|
||||
**jest**
|
||||
|
||||
...
|
||||
|
||||
**cypress**
|
||||
|
||||
...
|
||||
|
||||
## Rejected
|
||||
|
||||
## Assumptions
|
||||
|
||||
## Constraints
|
||||
|
||||
## Status
|
||||
Pending
|
||||
|
||||
## Consequences
|
||||
|
||||
## Open Questions
|
92
docs/adr/adr-language.md
Normal file
92
docs/adr/adr-language.md
Normal file
|
@ -0,0 +1,92 @@
|
|||
# ADR: Backend Programming Language
|
||||
DATE: 12/14/2022
|
||||
|
||||
## Background
|
||||
|
||||
What general programming language should we use for the back end tools we create? This includes the tax engine, the APIs, any code generation tools, and anything else that we might need that isn't part of the client software. We generally think it is advisable to keep the number of languages low to facilitate developers working on multiple parts of the system, which will help keep the number of required contractors down. There are both technical and nontechnical reasons why a language might be chosen.
|
||||
|
||||
#### Technical Questions
|
||||
|
||||
1. **How well does the language natively support graph concepts?** In the prototype we used a graph to express tax logic as configuration, with the hope that one day we could build a system that allows nontechnical users to update that logic. We believe this was a sound concept and will be carrying it forward into the new system.
|
||||
2. **How well would the language support building APIs?** Our plan is to use REST endpoints to retrieve and save data. We require a system that has strong support for these concepts and has been used at scale.
|
||||
3. **How well does the language support code generation?** We believe that code generation will be a key component of the system. Our front end pages and some of their logic will be generated from our backend configuration, and as such we need a language that lends itself to this kind of activity. This could include access to compiler tools or a strong domain modeling ability in the language.
|
||||
4. **Does the ecosystem of the language support SOAP/WSDL/SOA concepts?** The MeF API describes itself using WSDL and will expect communication through this enterprise architecture style communication pattern. Our language should have an ecosystem that supports this kind of communication/service. This means that we should be able to point to a WSDL and have calls and possibly domain objects generated with any included validation.
|
||||
5. **How strong is the tooling, the ecosystem, and the available libraries for the language?** The more we can automate, and the more we can rely on third parties for common interactions, the better. Our language should have connectors to commonly available cloud tools. It should have an IDE that works well, has a built in test runner, and has realtime support for warning and errors. We should not have to gather applications and create our own build tools.
|
||||
6. **How popular is the language and will it last?** Few things in programming are worse than building in a language that the community has abandoned. Languages, frameworks, and tools become popular and then disappear just as quickly. Some languages last for decades only to fall out of favor for newer tools. Our goal is to find a language that is popular enough that it won't disappear, i.e. it has buy-in from large companies and industry domains.
|
||||
|
||||
#### Non Technical Questions
|
||||
|
||||
1. **How likely are we to find contractors that know the language(s) well?** This is a tricky concept because it relies on two separate factors. The first is the pure number of contractors we imagine exist in a given language based on the popularity of the language and its use in government projects. The second is the relative quality of those contractors based on the ease of use of the language. Some languages are generally used by high skilled coders, and others are open to a wider audience where the skill level varies substantially.
|
||||
|
||||
2. **What is the IRS likely to accept?** What languages do they currently support? What sort of process have they built up around those languages and the tools that support those languages? This is really the ultimate question, as anything the IRS wouldn't accept is out by default!
|
||||
|
||||
## Decision
|
||||
|
||||
We have decided to take a hybrid Scala and Java approach.
|
||||
|
||||
## Rationale
|
||||
|
||||
The rationale can best be couched in the questions above. In short though, we believe that Scala best matches the domain and the problems at hand while Java has some of the tooling and auto generation features we would want for external communications. They interoperate seamlessly, so this shouldn't present a problem.
|
||||
|
||||
#### Technical Questions
|
||||
|
||||
1. Scala, and other functional languages have strong support for graph concepts. When building graph systems in other languages, it is common to employ functional concepts when architecting and coding the system. Even basic concepts like immutability and a lack of nulls helps to cut down on the amount of code required to generate the system. Beyond that, because of its natural affinity for recursion, Scala can handle traversal idiomatically. OO languages end up with extra lines and can be less expressive for the concepts we will be using.
|
||||
2. This is one of the areas where the hybrid approach with Java shines. For our API layer we will fall into a common Java Spring pattern, which gives us all of the great tooling and speed of development that ecosystem offers. Both Scala and Java have been used at scale (Scala comes from the word scalable).
|
||||
3. Functional languages are known for the ease in which a developer can model a domain and perform actions based on those models. Code generation, ignoring the read in and write out steps, is a pure process in the sense that it is idempotent. Working with pure functions is what functional languages are good at. Combining the two, a powerful domain model and pure functions, is a recipe for simple, powerful code generation.
|
||||
4. Again, this is where the Java hybrid approach comes in. Java was strong in the days of SOA, and in fact, many of the books about SOA used Java as the lingua franca. It has amazing tools to work with WSDL, to process SOAP messages, and to handle anything we aren't expecting that may come from the SOA world.
|
||||
5. Our belief is that the tooling is strong around both Scala and Java. The intelliJ IDE works well and meets all of the criteria of a modern, full service IDE. There are also alternative choices, like VSCode, which also meets all of the needs of a developer on a large scale system. Twitter, and many financial institutions have been using Scala successfully for many years.
|
||||
6. Scala has been around for 18 years, and is one of the most popular languages on the JVM. It is the domain language of choice in the financial space. It isn't going anywhere any time soon.
|
||||
|
||||
#### Non Technical Questions
|
||||
1. Scala is a difficult language. Many functional concepts will require exploration for mid level and novice programmers. We believe that the kinds of contractors who will be available will be more senior and will be better able to contribute long term to the project. We also believe that this language difficulty will protect us from common, simple mistakes at the cost of a higher ramp up time and a smaller potential contractor pool.
|
||||
|
||||
2. Because Scala runs on the JVM it should be acceptable. All of the support systems are already in place within the IRS to update and manage Java systems. Except for the code, our systems will appear no different. They will use common Java libraries, common java tools, and require the same sorts of maintenance.
|
||||
|
||||
## Rejected Languages
|
||||
|
||||
Below is the set of languages we measured and rejected in order of how well suited we feel they are to the problem space
|
||||
1. **C#**: This was a close contender with Scala. It checks all of the boxes, but we felt that Scala is better for graphs and is closer to the stack that the IRS is comfortable with.
|
||||
2. **Kotlin**: This language is very popular in the JVM ecosystem. It fell behind C# because of our lack of familiarity and its relative newness. It had the same pitfalls as other OO languages.
|
||||
3. **Java**: Java was strongly considered, as it is the main language of the IRS, but we felt that it lacked many features that would make developing this application quick and easy.
|
||||
|
||||
The rest (no particular order):
|
||||
- **F#**: This language has many of the features of Scala, but it is on the dotnet framework. We don't know if the IRS currently manages any dotnet systems. It is also a little less common than Scala. If we were to pick between the two for a general large scale production application, Scala would be the winner.
|
||||
- **Rust**: We are just too unfamiliar with Rust to choose it to build something like this. It is also very new, which tends to come with a lot of painful changes later.
|
||||
- **Go**: This is another language that might be great, but we don't know enough about it. The ramp up time would be more than we have.
|
||||
- **Ruby**: We built the prototype for most of this work in Ruby. While it worked well, we don't want to have to go through open source package management nightmares in a production application.
|
||||
- **Typescript/Node**: Same as Ruby. The dependency on abandoned projects with security holes is a problem. Those packages can also introduce versioning conflicts and a whole host of other problems that we would rather not deal with from a support angle.
|
||||
- **Python**: Same as Ruby and Typescript!
|
||||
|
||||
|
||||
## Assumptions
|
||||
|
||||
- We will represent tax logic in a graph
|
||||
- Rest endpoints are the preferred way of handling client-server communications
|
||||
- Code generation is a workable approach
|
||||
- MeF uses SOA concepts like WSDL and SOAP
|
||||
- Contractors will be of higher quality
|
||||
- Contractors in some languages are better than others
|
||||
- Functional languages are better for dealing with graphs
|
||||
- Functional languages are good for code generation for the reasons stated above
|
||||
- The IRS doesn't use dotnet
|
||||
- The hybrid approach is simple and easy to do
|
||||
- Ruby/Python/Node have package issues that make them less desirable for government work
|
||||
|
||||
|
||||
## Constraints
|
||||
- The language should be understood by the team.
|
||||
- The language should have practical use in large scale systems.
|
||||
- The language should have financial domain applications.
|
||||
- The language should be common enough that contractors are available.
|
||||
- The language should follow a known, common support model.
|
||||
- The language/ecosystem should be known by the government.
|
||||
- The language has support for our domain concepts
|
||||
- The language has existed for a long enough period of time to have gained common adoption.
|
||||
|
||||
## Status
|
||||
Pending
|
||||
|
||||
## Consequences
|
||||
The first, and most obvious consequence of this decision is that we won't be using another programming language. This locks us into a specific set of tools and resources.
|
||||
|
||||
|
59
docs/adr/adr-new-user-allowed-feature-flag.md
Normal file
59
docs/adr/adr-new-user-allowed-feature-flag.md
Normal file
|
@ -0,0 +1,59 @@
|
|||
# New Users Allowed Feature Flag for Phase C of Pilot Rollout
|
||||
Date: 1/17/2024
|
||||
|
||||
## Status
|
||||
Complete
|
||||
|
||||
## Context
|
||||
For the Phase C launch which will be controlled availability (and Phase D launch of limited availability), we want to limit **when** new users have access to Direct File and **how many** new users can access Direct File.
|
||||
|
||||
### Requirements
|
||||
- Allow new users to access Direct File only during short open windows, and only until the max number of new users has been reached during that window
|
||||
- Max number of new users will be in the hundreds or thousands
|
||||
- The open windows will be brief, for example a 2-hour period that is defined the day before, and will be unannounced
|
||||
- Outside of the open windows, no new users will be allowed to use Direct File
|
||||
- Outside of the open windows, users who were previously allowed will be able to use Direct File as normal
|
||||
- Users and emails who were previously allowed during the Phase A email allow list launch should still be allowed during Phase C
|
||||
- If users are ON the allowlist, they are always allowed regardless of max total users
|
||||
- If users are NOT on the allowlist, their access will be granted if and only if the open enrollment window is open and max users haven't been reached
|
||||
- When we approach Phase D, all of the above requirements apply, except the open windows will be slightly longer, such as 1 week
|
||||
- Closing a window needs to happen within 20 minutes of the IRS decision to close. We will need to coordinate with TKTK to meet this SLA.
|
||||
|
||||
## Decision
|
||||
|
||||
### Use a config file in S3 as our feature flags
|
||||
Use a JSON file in S3 for Direct File feature flags that does not require a deployment to update.
|
||||
|
||||
#### Config File
|
||||
- The JSON file will only contain feature flag variables that apply everywhere in the DF application, starting with the following 2 flags for Phase C. No PII will be included. (A sketch of the file contents follows this list.)
|
||||
- `new-user-allowed`: boolean representing whether we are in an open window and new user accounts are allowed
|
||||
- `max-users`: int representing the maximum number of allowed users there can be in Direct File currently. This is a failsafe to prevent mass signups before the open window can be closed. This would be calculated by checking the number of rows in `users` where allowed is `true` before opening a window and adding the max number of new accounts we would like to allow in the open window. This does not need to be exact; stopping the creation of new accounts roughly around `max-users` is good enough.
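A sketch of what the file's contents might look like (values are illustrative only, shown as a TypeScript literal mirroring the JSON):

```ts
// Hypothetical contents of feature-flags.json; values are illustrative only.
export const exampleFeatureFlags = {
  'new-user-allowed': true, // an open window is currently in effect
  'max-users': 5000,        // existing allowed users plus headroom for this window
};
```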
|
||||
|
||||
### Direct File Implementation
|
||||
- Set up a new default config JSON file, `feature-flags.json`, to store our feature flags, plus additional config files per environment
|
||||
- DF Backend app polls for the config file every `1 minute` to pick up changes and load the file into memory
|
||||
- For every request to the DF backend, check whether the user is allowed (see the sketch after this list):
|
||||
- Check whether user already exists in the `users` DB
|
||||
- If yes (the user exists): check the `allowed` column to see if the user has been allowed
|
||||
- If true: allow access to DF as normal
|
||||
- If false:
|
||||
- If `new-user-allowed` is true and `max-users` > count of records in `users` where `allowed = true`: set allowed to true for this user
|
||||
- Else, respond with a 403 error code and a custom message
|
||||
- If no (the user does not exist): check feature flags and the email allow list
|
||||
- If `new-user-allowed` is true and `max-users` > count of records in `users` where `allowed = true`: set allowed to true for this user and allow the user access to Direct File
|
||||
- Else if hashed SADI email matches an email in the email allow list: set allowed to true and allow the user access to Direct File
|
||||
- Else, respond with a 403 error code and a custom message
|
||||
- A code deployment is needed in order to update the feature flags
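A minimal sketch of the per-request allow check described above (the real backend is Java/Spring; the names and shapes below are hypothetical and for illustration only):

```ts
// Hypothetical sketch of the allow decision; not actual Direct File code.
interface FeatureFlags {
  'new-user-allowed': boolean;
  'max-users': number;
}

interface UserRecord {
  exists: boolean;  // a row for this user is present in the users table
  allowed: boolean; // value of the allowed column (false when no row exists)
}

export function isAllowed(
  flags: FeatureFlags,
  user: UserRecord,
  allowedUserCount: number,  // count of rows in users where allowed = true
  emailOnAllowList: boolean, // hashed SADI email matches the Phase A allow list
): boolean {
  // An "open window" means new users are allowed and there is headroom under max-users.
  const windowOpen = flags['new-user-allowed'] && flags['max-users'] > allowedUserCount;
  if (user.exists) {
    // Previously allowed users keep access; others only get in during an open window.
    return user.allowed || windowOpen;
  }
  // Brand-new users get in during an open window, or if their email is on the allow list.
  return windowOpen || emailOnAllowList;
}
```

Whenever this decision is false, the response described above is a 403 with a custom message.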
|
||||
|
||||
### Updating the feature flag file
|
||||
1. DF Backend PM will file a ticket with the updated config file. In the ticket, request to upload the new file to the `artifact-storage` S3 bucket in the correct environment
|
||||
1. A turnaround time of 5 minutes or less is expected
|
||||
1. DF backend will pick up the changes in the next poll for the new file to S3
|
||||
|
||||
## Other Options Considered
|
||||
|
||||
1. Store config values as DB record and have a TKTK dba run a DB query in PROD
|
||||
1. Use a secrets manager? Parameter store?
|
||||
1. Use a real feature flag service so that business users could change flags in a UI – a long term solution
|
||||
|
||||
We decided to use the config file approach due to the ease of setting it up and our familiarity with the design and update process from the Phase A decision.
|
249
docs/adr/adr-optional-facts.md
Normal file
249
docs/adr/adr-optional-facts.md
Normal file
|
@ -0,0 +1,249 @@
|
|||
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Optional facts"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Decided"
|
||||
date: "2023-11-14"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Optional Facts
|
||||
|
||||
## Context and problem statement
|
||||
|
||||
The fact graph does not have a way to represent optional facts and we need to set certain fields such as W2 boxes and middle initials to be optional.
|
||||
|
||||
We need this feature for MVP.
|
||||
|
||||
## Desirable solution properties
|
||||
|
||||
We'd like the solution to this problem to
|
||||
|
||||
1. Allow us to show in the frontend to users that fields are optional.
|
||||
2. Allow the user to leave the field blank and hit `Save and continue`.
|
||||
3. Allow the fact graph to properly calculate downstream calculations without this fact.
|
||||
|
||||
## Considered options
|
||||
|
||||
1. Create a mechanism to allow specific writable facts to be left incomplete
|
||||
1. Create the idea of null or default values in the fact graph/fact dictionary
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: **Create a mechanism to allow specific writable facts to be left incomplete**
|
||||
|
||||
We do this by creating a second derived fact that is always complete. The original fact is prefixed with `writable-`. The derived fact uses the incompleteness of the writable fact to provide a default value. See below for instructions.
|
||||
|
||||
Reasoning: This one doesn't complicate our fact system with the ideas of placeholders, default values, `hasValue`s, validity, etc., because it keeps writable values separate from the downstream value. So it might be easier to do and easier to understand.
|
||||
|
||||
However, we felt that in a future v2 we might do the other option. For now, we have proceeded with this option.
|
||||
|
||||
### Consequences
|
||||
|
||||
1. Good: We can start marking fields as optional.
|
||||
2. Good: We can test that optional fields aren't breaking the flow downstream.
|
||||
3. Good: We can test that visual treatment for optional fields works for users.
|
||||
4. Bad: The scala option where we add the concept of optional fields into the graph itself might be more comprehensive.
|
||||
|
||||
### How to make a fact optional
|
||||
|
||||
Engineers looking to make a fact optional can do the following.
|
||||
|
||||
1. Find the component in `flow.tsx`. Not all fact types are enabled for optionality!!
|
||||
|
||||
Only `GenericString` fields and `TextFormControl` based fields (Dollar, Ein, Pin etc.) are enabled for optionality.
|
||||
|
||||
```xml
|
||||
<Dollar path='/formW2s/*/wages'/>
|
||||
```
|
||||
|
||||
* Add `required='false'` and rename to `writableWages`.
|
||||
|
||||
We must rename the fact, because we have to ensure no derived facts get marked as optional.
|
||||
|
||||
```xml
|
||||
<Dollar path='/formW2s/*/writableWages' required='false' />
|
||||
```
|
||||
|
||||
2. Find out if there should be a default value for this fact, or if it can be left incomplete.
|
||||
* Dollar values can often use 0 as a default.
|
||||
* Strings or enums may not have a correct default, and we may want to leave these incomplete.
|
||||
* An incomplete fact can break other parts of the flow so we should verify with backend before leaving it incomplete.
|
||||
* Reach out on Slack to get guidance from backend team members.
|
||||
|
||||
**IF DEFAULT IS NEEDED:** Find the fact in `ty2022.xml` or other fact dictionary files, and create a derived version that is always complete.
|
||||
It uses the incompleteness of the writable fact to provide a default value.
|
||||
|
||||
* Replace this:
|
||||
|
||||
```xml
|
||||
<Fact path="/formW2s/*/wages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
</Description>
|
||||
|
||||
<Writable>
|
||||
<Dollar />
|
||||
</Writable>
|
||||
|
||||
<Placeholder>
|
||||
<Dollar>0</Dollar>
|
||||
</Placeholder>
|
||||
</Fact>
```
|
||||
* With this (be sure to remove `<Placeholder>` if there is one):
|
||||
|
||||
```xml
|
||||
<Fact path="/formW2s/*/writableWages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
|
||||
This is the writable optional fact. Can be left incomplete.
|
||||
Please use the derived fact in downstream calculations.
|
||||
</Description>
|
||||
|
||||
<Writable>
|
||||
<Dollar />
|
||||
</Writable>
|
||||
</Fact>
|
||||
|
||||
<Fact path="/formW2s/*/wages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
</Description>
|
||||
|
||||
<Derived>
|
||||
<Switch>
|
||||
<Case>
|
||||
<When>
|
||||
<IsComplete>
|
||||
<Dependency path="../writableWages" />
|
||||
</IsComplete>
|
||||
</When>
|
||||
<Then>
|
||||
<Dependency path="../writableWages" />
|
||||
</Then>
|
||||
</Case>
|
||||
<Case>
|
||||
<When>
|
||||
<True />
|
||||
</When>
|
||||
<Then>
|
||||
<Dollar>0</Dollar>
|
||||
</Then>
|
||||
</Case>
|
||||
</Switch>
|
||||
</Derived>
|
||||
</Fact>
|
||||
```
|
||||
|
||||
|
||||
**IF DEFAULT IS NOT NEEDED:** Find the fact in `ty2022.xml` or other fact dictionary files, and create a derived version that may be incomplete.
|
||||
|
||||
* Replace this (don't remove the Placeholder):
|
||||
|
||||
```xml
|
||||
<Fact path="/formW2s/*/wages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
</Description>
|
||||
|
||||
<Writable>
|
||||
<Dollar />
|
||||
</Writable>
|
||||
|
||||
<Placeholder>
|
||||
<Dollar>0</Dollar>
|
||||
</Placeholder>
|
||||
</Fact>
|
||||
```
|
||||
with this:
|
||||
```xml
|
||||
<Fact path="/formW2s/*/writableWages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
</Description>
|
||||
|
||||
<Writable>
|
||||
<Dollar />
|
||||
</Writable>
|
||||
|
||||
<Placeholder>
|
||||
<Dollar>0</Dollar>
|
||||
</Placeholder>
|
||||
</Fact>
|
||||
|
||||
|
||||
<Fact path="/formW2s/*/wages">
|
||||
<Name>Wages</Name>
|
||||
<Description>
|
||||
Box 1 of Form W-2, wages, tips, and other compensation.
|
||||
</Description>
|
||||
|
||||
<Derived>
|
||||
<Dependency path='../writableWages' />
|
||||
</Derived>
|
||||
</Fact>
|
||||
```
|
||||
|
||||
3. Search the `flow.tsx` for other components using the writable fact and update the fact name.
|
||||
|
||||
|
||||
4. Rebuild the fact dictionary code
|
||||
|
||||
* In `direct-file/df-client/fact-dictionary` run
|
||||
|
||||
```sh
|
||||
npm run build
|
||||
```
|
||||
|
||||
5. Edit `en.yaml` and the other locales to replace field labels for the writable fields
|
||||
|
||||
```yaml
|
||||
/formW2s/*/writableWages:
|
||||
name: Wages, tips, other compensation
|
||||
```
|
||||
|
||||
6. Finally, run tests and fix any test cases that should use the writable field, or alternatively the derived field.
|
||||
|
||||
|
||||
|
||||
## Pros and Cons
|
||||
|
||||
### Create a mechanism to allow specific writable facts to be left incomplete
|
||||
|
||||
We do this by creating a second derived fact that is always complete. The original fact is prefixed with `writable-`. The derived fact uses the incompleteness of the writable fact to provide a default value.
|
||||
|
||||
#### Pros
|
||||
|
||||
* Lower lift
|
||||
* Doesn't complicate the fact system with placeholders and potentially avoids complex bugs in the fact system at this late stage.
|
||||
* Easier to understand.
|
||||
|
||||
#### Cons
|
||||
|
||||
* It's likely more correct and complete for our fact system to understand optionality.
|
||||
|
||||
### Create the idea of null or default values in the fact graph/fact dictionary
|
||||
|
||||
In this method, we would change the Scala code and the fact graph to handle the concept of null or default values.
|
||||
|
||||
If something has a default value in the fact dictionary, we pick that up on the frontend (similar to how we do enumOptions) and then let a person skip the question, using the default value in its place.
|
||||
|
||||
#### Pros
|
||||
|
||||
* This might be more logically correct as we're not overloading the concept of `incomplete` with `optional+empty`
|
||||
|
||||
#### Cons
|
||||
|
||||
* Bigger lift than the other option.
|
||||
* Requires Scala rework, which we have less engineering bandwidth for.
|
||||
* Currently, every writable fact is either `incomplete + no value`, `incomplete + placeholder`, or `complete`. Making this change would add a fourth state of `incomplete + default value` and that might have multiple downstream consequences in our application.
|
||||
|
106
docs/adr/adr-pseudolocalization.md
Normal file
106
docs/adr/adr-pseudolocalization.md
Normal file
|
@ -0,0 +1,106 @@
|
|||
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Pseudolocalization"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Proposed"
|
||||
date: "2023-07-21"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Pseudolocalization
|
||||
|
||||
## Context and problem statement
|
||||
|
||||
Currently the frontend is using a Spanish language translation file as an alternate translation. However, we do not have translations yet, so all strings translate to TRANSLATEME. Also, we will likely always be behind in sourcing the translations.
|
||||
|
||||
Pseudolocalization generates a translation file that translates “Account Settings” to Àççôôûññţ Šéţţîñĝš !!!
|
||||
|
||||
It allows us to easily catch:
|
||||
|
||||
* where we’ve missed translating a string
|
||||
* where we’ve missed internationalizing a component
|
||||
* where a 30% longer string would break the UI
|
||||
* where a bidirectional language would break the layout (perhaps not needed right now)
|
||||
|
||||
Developers can test the flow and read the pseudolocalized strings as they navigate the UI, so errors become easy to catch during normal development.
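A conceptual sketch of the transform (this is not the chosen npm package, just an illustration of the technique):

```ts
// Toy pseudolocalization: swap ASCII letters for accented look-alikes and pad length ~30%.
const ACCENTED: Record<string, string> = {
  a: 'à', c: 'ç', e: 'é', g: 'ĝ', i: 'î', n: 'ñ', o: 'ô', s: 'š', t: 'ţ', u: 'û',
  A: 'À', E: 'É', O: 'Ô', S: 'Š', U: 'Û',
};

export function pseudolocalize(text: string): string {
  const swapped = [...text].map((ch) => ACCENTED[ch] ?? ch).join('');
  const padding = '!'.repeat(Math.ceil(text.length * 0.3)); // simulate ~30% longer strings
  return swapped + padding;
}

// pseudolocalize('Account Settings') === 'Àççôûñţ Šéţţîñĝš!!!!!'
```

The chosen package provides a more complete implementation of this idea; the sketch above only shows the string transform.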
|
||||
|
||||
## Desirable solution properties
|
||||
|
||||
We'd like the solution to this problem to
|
||||
|
||||
1. Highlight issues for developers and design to catch early and often
|
||||
2. Not add additional load to development and design
|
||||
|
||||
## Considered options
|
||||
|
||||
1. Don't change anything, use fixed string for all translations
|
||||
2. Use a package to generate a new translated set of strings
|
||||
3. Use a package to hook into react and translate strings dynamically
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: Use a package to generate a new translated set of strings.
|
||||
Update 20240417: Now that we have Spanish, we are removing the pseudolocalization
|
||||
|
||||
Reasoning: Using a set of translated strings is the best way to catch if we've forgotten to hook something up correctly. Given our unusual use of the dynamic screens, FactGraph and translations, there's a possibility of bugs as we source translations for strings, facts, enums, etc.
|
||||
|
||||
If we don't change anything, we will continue to introduce more technical debt that will need to be fixed at a later date.
|
||||
|
||||
Package chosen: https://www.npmjs.com/package/pseudo-localization
|
||||
|
||||
### Consequences
|
||||
|
||||
1. Good: Easy way to ensure we are keeping our codebase properly internationalized.
|
||||
2. Good: Devs can test the whole flow quickly without switching languages to understand what each field means.
|
||||
3. Good: We can automate updating the translation so it's even less load on the developers.
|
||||
4. Bad: Maybe we run into an issue with the pseudotranslation that won't be an issue with a real translation?
|
||||
5. Bad: Spanish will not be our second language in the UI switcher.
|
||||
|
||||
## Pros and Cons
|
||||
|
||||
### Don't change anything, use fixed string for all translations
|
||||
|
||||
#### Pros
|
||||
|
||||
* Straightforward, it's what we have already.
|
||||
|
||||
#### Cons
|
||||
|
||||
* We can't tell which fields are which and testing is harder.
|
||||
* May hide (or cause) issues that exist (or don't) with actual translations - For e.g. if 'Single' is selected but later 'Married' is shown, there's no way to tell if both strings are translated to 'TRANSLATEME'.
|
||||
* All strings are the same length.
|
||||
|
||||
### Use a package to generate a new translated set of strings
|
||||
|
||||
Package chosen: https://www.npmjs.com/package/pseudo-localization
|
||||
|
||||
#### Pros
|
||||
|
||||
* Devs can test the whole flow quickly without switching languages to understand what each field means.
|
||||
* We can automate updating the translation so it's even less load on the developers.
|
||||
* Tests the UI with +30% string length.
|
||||
* This package appears to be the most flexible and popular option on npm. It has an MIT license.
|
||||
|
||||
#### Cons
|
||||
|
||||
* Requires ticket and work to update the translation script
|
||||
* Spanish will not be the second language (although once the language switcher supports more languages we can just have both)
|
||||
* This package was last updated in 2019 - so although it's popular enough, we prefer to use it just as tooling vs. hooking it directly into our codebase.
|
||||
|
||||
### Use a package to hook into react and translate strings dynamically
|
||||
|
||||
#### Pros
|
||||
|
||||
* Can be hooked into frontend dynamically.
|
||||
|
||||
#### Cons
|
||||
|
||||
* Doesn't actually exercise our own flow and the aggregation and interpolation of strings from flow.xml and translation JSON files.
|
||||
* Could appear to "fix" strings that aren't actually properly internationalized.
|
||||
|
||||
## References
|
||||
|
||||
* More on pseudolocalization: <https://www.shopify.com/partners/blog/pseudo-localization>
|
33
docs/adr/adr-screener-config-update.md
Normal file
33
docs/adr/adr-screener-config-update.md
Normal file
|
@ -0,0 +1,33 @@
|
|||
[//]: # ([short title of solved problem and solution])
|
||||
|
||||
Status: [proposed | rejected | accepted | deprecated | … | superseded by ADR-0005]
|
||||
Deciders: [list everyone involved in the decision]
|
||||
Date: [YYYY-MM-DD when the decision was last updated]
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
[//]: # ([Describe the context and problem statement, e.g., in free form using two to three sentences. You may want to articulate the problem in form of a question.])
|
||||
A decision was made in [adr-screener-config](./adr-screener-config.md) to use the Astro SSG (static site generator) for the screener application. It was initially used for an MVP, and later replaced with React/Vite. This ADR documents that change and supersedes the previous ADR.
|
||||
|
||||
## Decision Drivers
|
||||
|
||||
- the realization that the application needed to support more dynamic features such as react-uswds and i18n features.
|
||||
|
||||
## Considered Options
|
||||
|
||||
React/Vite
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "React/Vite", because it was consistent with the client app and the approach to i18n could be consistent across the screener and client apps.
|
||||
|
||||
### Positive Consequences
|
||||
|
||||
- More dynamic content is an option
|
||||
- We can easily utilize react-uswds library.
|
||||
- The i18n system is aligned in both the screener and the client app.
|
||||
- Engineers don't need to learn multiple systems and can seamlessly develop between the two apps. Onboarding for new engineers is simplified.
|
||||
|
||||
### Negative Consequences
|
||||
|
||||
- It's more complex than a more static configuration would be
|
121
docs/adr/adr-screener-config.md
Normal file
121
docs/adr/adr-screener-config.md
Normal file
|
@ -0,0 +1,121 @@
|
|||
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Screener static site"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Decided"
|
||||
date: "2023-08-08"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Screener static site
|
||||
|
||||
superseded by: [adr-screener-config-update](../adr-screener-config-update.md)
|
||||
|
||||
## Context and problem statement
|
||||
|
||||
The screener is a set of four to five pages that serve as the entry point to Direct File. They will contain a set of simple questions with radio buttons that advance the user to the login flow or "knock" them out if any answers indicate they do not qualify for Direct File.
|
||||
|
||||
Since the frontend will not be saving the user's answers to these questions, we have an architectural choice on how to create these pages. These will be unprotected routes not behind our login so the SADI team recommends hosting static pages for this content.
|
||||
|
||||
Also, we need a way to make changes to content, order of questions, and add or remove pages.
|
||||
|
||||
## Desirable solution properties
|
||||
|
||||
We'd like the solution to this problem to:
|
||||
|
||||
1. Allow the team to edit the config to change the order of questions and add or remove pages entirely.
|
||||
2. Allow the team to make changes to content of the screener.
|
||||
3. Allow the team to deploy the app with high confidence that no regressions will occur.
|
||||
4. Have the static site support i18n and possibly reuse keys from existing client translation files.
|
||||
|
||||
## Assumptions
|
||||
|
||||
- We can change content and the config after the code freeze (will most likely require policy changes)
|
||||
- No answer to a question in the screener is derived from a previous one with the exception of the initial state dropdown
|
||||
- Every question (except the first state selection dropdown) will be a set of radio buttons with either:
|
||||
- "Yes" or "No"
|
||||
- "Yes," "No," or "I don't know" (and more content specific answers for the third option)
|
||||
- The static site generator (SSG) will live in the same repo, just within another folder.
|
||||
|
||||
## Considered options
|
||||
|
||||
1. Use an SSG (static site generator) to create static HTML pages
|
||||
2. Create another React SPA with a custom JSON or TS config
|
||||
3. Integrate into the existing Direct File React SPA
|
||||
|
||||
## Trying to answer feedback
|
||||
|
||||
### Why not just plain HTML and JS?
|
||||
|
||||
The library we're currently using in the DF client for i18n is i18next. It's certainly possible to use i18next in a Vanilla JS and HTML site, but I think it's worth trying to reuse our existing approach using the react-i18next library. SSG libraries also provide a better developer experience in updating this content.
|
||||
|
||||
### Ok. Why not another Vite/React SPA?
|
||||
|
||||
The site is limited and fairly static so creating a SPA doesn't seem to fit the use case. Content largely drives the screener so a static site with a multi-page architecture fits the use case better than a fully-fledged SPA where lots of app-logic would be necessary.
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
**Chosen option**: Use an SSG to build the app.
|
||||
|
||||
**Reasoning**: Using an SSG library, we can safely deploy the screener pages outside of the main direct file app. This allows us to have the unprotected routes of the app separate from the protected ones. This allows us to integrate with SADI more easily and also support changes to content quickly.
|
||||
|
||||
Also, with an SSG library (instead of plain HTML and JS) we can use the features of i18next more easily like we are doing in the DF client.
|
||||
|
||||
If we choose to integrate the screener into the React SPA, this will make it more complicated to secure these routes.
|
||||
|
||||
**Library Chosen**: Astro. We can revisit this choice if it proves not to work for us, but it's focused on content, ships as little JS as possible, provides React support if we need it, and provides good i18n support with an i18next library.
|
||||
|
||||
### Consequences
|
||||
|
||||
- Good: Astro supports React so we can reuse components from the client
|
||||
- Good: Astro has i18next-specific libraries and we may even be able to use react-i18next
|
||||
- Good: Creating all the pages at build will result in separate pages/paths for each language (i.e. /en/some-page, /es/otra-pagina).
|
||||
- Good: We can make changes to the questions, content, and pages fairly easily.
|
||||
- Good: We can use snapshot testing to have a high degree of certainty that the UI doesn't change unexpectedly.
|
||||
- Good: Easier to integrate with SADI if these pages are separate.
|
||||
- Good: Can use i18next with the library to keep our translation mechanism the same, perhaps use one source of translation files.
|
||||
- Bad: Will require separate CI/CD build pipeline.
|
||||
- Bad: Another place to maintain content and possibly a config.
|
||||
- Bad: Another tool for frontend developers on the team to learn.
|
||||
|
||||
## Overview of SSGs
|
||||
|
||||
### Astro
|
||||
|
||||
- Focused on shipping less JS, focus on static HTML output
|
||||
- High retention in [state of JS survey](https://2022.stateofjs.com/en-US/libraries/rendering-frameworks/)
|
||||
- Support for multiple frameworks including React
|
||||
- Support for TypeScript
|
||||
- Good i18next integration
|
||||
|
||||
### Vite
|
||||
|
||||
- Already have this tool in the codebase
|
||||
- Provides TS support, React, i18n
|
||||
- Designed more with an SPA in mind
|
||||
- SSG support doesn't appear to come "out of the box," [but it is possible](https://ogzhanolguncu.com/blog/react-ssr-ssg-from-scratch).
|
||||
|
||||
### Gatsby
|
||||
|
||||
- Support for React
|
||||
- Large plugin ecosystem
|
||||
- Support for TypeScript
|
||||
- Seems unnecessarily complex given our needs (GraphQL features)
|
||||
|
||||
### Next.js
|
||||
|
||||
- Support for React
|
||||
- SSR tools on pages if necessary
|
||||
- Support for TypeScript
|
||||
- Seems unnecessarily complex given our needs (SSR)
|
||||
|
||||
### Eleventy
|
||||
|
||||
- Multiple template languages
|
||||
- Supports vanilla JS
|
||||
- Flexible configuration
|
||||
- Very fast build time
|
||||
- Good for sites with a lot of pages
|
34
docs/adr/adr-tax-year-2024-development.md
Normal file
34
docs/adr/adr-tax-year-2024-development.md
Normal file
|
@ -0,0 +1,34 @@
|
|||
# Direct File Development 2024
|
||||
|
||||
Date: 06/14/2024
|
||||
|
||||
## Status
|
||||
|
||||
## Context
|
||||
|
||||
In our current state, Direct File cannot support filing taxes for two different tax years. That is, we cannot
|
||||
make changes that expand our tax scope or prepare us for TY 2024 without causing breaking changes to TY 2023 functionality.
|
||||
|
||||
As an example of this, we've added a new required question for schedule B that asks every taxpayer, "Do you have a foreign bank account?". Since that question is new and required, every person who submitted a tax return for 2023 now has an incomplete tax return against main.
|
||||
|
||||
We essentially have two options:
|
||||
1. We can build a new system where we maintain multiple versions of the flow, fact dictionary, mef, and pdf that support multiple tax years and scopes at one time. This is a lot of work to do.
|
||||
2. We can choose not to worry about backwards compatibility and block production users from accessing the flow/fact graph. We use generated PDFs to expose last year's answers to the user. We have to build a way to block users from accessing the flow in production, but don't need to maintain multiple versions of the app. There is still a consequence that users will never be able to re-open last year's tax return in the flow.
|
||||
|
||||
|
||||
## Decision
|
||||
- Next year, we will probably have to deal with multi-year support and multiple fact dictionaries/flows/etc. But this year, we're concentrating on expanding our tax scope to reach more users. That is the priority, rather than providing a good read experience for last year's 140k users.
|
||||
- We are ok to make non-backwards compatible changes. We can make changes that will break an existing user's fact graph. We will never try to open last year's users returns in a fact graph on main.
|
||||
- We're going to continue deploying 23.15 to prod through late June 2024, possibly into Q3 2024, until a point where:
|
||||
- We have generated all existing users tax returns into PDFs
|
||||
- We have a feature flagging method so that existing users in prod will be able to download their tax return as a PDF in prod
|
||||
- Users in other test environments will be able to run with the new enhanced scope and we'll be building for next year.
|
||||
|
||||
Additional product thinking is needed on what the TY2024 app looks like for users who filed in TY2023 through Direct File.
|
||||
|
||||
|
||||
|
||||
## Consequences
|
||||
- This means that we will need to apply security patches to 23.15 separately from main. We should probably also set it to build weekly/regularly.
|
||||
- Main is going to move ahead for TY2024. The next version of Direct File will always be the best one.
|
||||
- We have work to do to put in a feature flag for last year's filers' experience in production so that those users will be able to download last year's tax return as a PDF.
|
48
docs/adr/adr-uswds-customization.md
Normal file
48
docs/adr/adr-uswds-customization.md
Normal file
|
@ -0,0 +1,48 @@
|
|||
| ADR title | Custom USWDS configuration settings in 2024 Direct File Pilot |
|
||||
|-----------|-----------------------------------------------------------------|
|
||||
| status | Approved |
|
||||
| date | 2023-08-23 |
|
||||
|
||||
# Custom CSS theming of USWDS within Truss React Components
|
||||
|
||||
## Context and problem statement
|
||||
|
||||
The USWDS design system is implemented in Direct File using [ReactUSWDS Component Library](https://trussworks.github.io/react-uswds/?path=/story/welcome--welcome), an open source library built and maintained by Truss.
|
||||
Truss includes out-of-the-box CSS with each component, with some variation in color and component display options.
|
||||
Using a component library helps engineers build Direct File quickly while maintaining a high degree of quality and consistency in the UI.
|
||||
|
||||
It is possible to [customize configuration settings](https://designsystem.digital.gov/documentation/settings/#configuring-custom-uswds-settings) within the USWDS design system. Developing a shared understanding of this system across Engineers, Designers, and PMs and then organizing and designing within these parameters is a sizeable effort for the team. It is a body of work that would add additional risk to meeting our project deadline.
|
||||
|
||||
Examples of these challenges are:
|
||||
- Anticipating knock-on effects of any particular customization has proven time-consuming. An example of this customization pertained to the header. The design switched the header color to dark, and when the default stock menu was applied, the menu button text was dark grey, which is a visibility/accessibility issue. CSS was added to make that menu button text light but then in responsive mode, these buttons get moved to the dropdown nav, which has a light background. So then that menu button text needs to be dark, etc. This type of reactive color fixing creates an inconsistency and is time consuming.
|
||||
- Coordinating common practices of how we implement design overrides at the code level is time-consuming and situational
|
||||
- Creating consistency across separate applications (i.e. the Screener and the Filing application itself) requires more overhead the further we deviate from an out-of-the-box implementation of USWDS
|
||||
|
||||
Therefore, we propose that USWDS settings customization is out of scope for the Pilot not because it is not possible, but because at this point, investing time into customization is a lower priority than completing the base functionality required for the successful launch of the Pilot.
|
||||
|
||||
On a system-wide level, design tokens provided by `uswds-core` can be overwritten in `_theme.css`. Some system-wide theme guidelines currently exist in this file. It's my recommendation that we continue to utilize this pattern for CSS settings which are confidently applied application-wide.
|
||||
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Customizing the settings of USWDS within Truss React Components is not in scope for the 2024 Direct File Pilot.
|
||||
|
||||
### Consequences
|
||||
|
||||
Pros:
|
||||
|
||||
- Using out-of-the-box functionality enables designers and engineers to move quickly and meet our go/no go deadline for the Pilot
|
||||
- This does not have to be a permanent decision: we can go back later and develop a strategy for customizing the application in CY2024 or beyond
|
||||
- Taxpayers will experience a more consistent UX.
|
||||
|
||||
Cons:
|
||||
|
||||
- This limits the look and feel of Direct File Pilot for 2024.
|
||||
- There is some in-progress work that will need to be abandoned and/or removed (tech debt).
|
||||
|
||||
## Next Steps
|
||||
- Designer leads review the Source of Truth to confirm sketches that are being used for Design Audit are consistent with USWDS out-of-the-box configuration, and not accidentally introducing/proposing customization in new Developer tickets. (Suzanne, Jess, Jen)
|
||||
- Designers working on Design Audit ensure new tickets written do not introduce customization to USWDS components.
|
||||
- PM/Delivery lead for Design Audit and Components must audit tickets as they come through to ensure that they are not introducing new customization to USWDS components.
|
||||
- All teams must be made aware to ensure future commits to Source of Truth mural accurately reflect this decision.
|
||||
|
78
docs/adr/adr_a11y-methods.md
Normal file
78
docs/adr/adr_a11y-methods.md
Normal file
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Accessibility testing methods"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Accepted"
|
||||
date: "2023-07-11"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Accessibility testing methods
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
Section 508 describes ways we can support people with disabilities, and many accessibility tech/tools (AT) exist that support various tasks. We on this project team can use those tools to inspect our work, and we can use other tools that can simulate how those tools might work with the code we push to this repository. This decision record sets how we achieve our accessibility target (set in the "Accessibility Target" DR) in an agile way.
|
||||
|
||||
## Considered Options
|
||||
|
||||
* Review accessibility factors in every PR, where applicable
|
||||
* Review accessibility factors at least every other sprint (once per month)
|
||||
* Review accessibility factors in the final stages of this project before release
|
||||
* Supplement manual review with automated accessibility review using a tool like pa11y
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "Review accessibility factors in every PR, where applicable", because our goal is to prioritize accessibility, and this saves us time in the long run by not introducing any problems that could best be fixed at a PR level.
|
||||
|
||||
Per discussion internally and with TKT, the team will be shifting towards (1) "Review accessibility factors at least every other sprint (once per month)"; and (2) "Supplement manual review with automated accessibility review using a tool like pa11y." This will ensure more automated coverage for a11y testing, and less designer time spent on manual review.
|
||||
|
||||
### Consequences
|
||||
|
||||
* Good, because it maximizes accessibility compliance opportunities
|
||||
* Good, because it *prevents* accessibility problems from showing up rather than waiting to fix them
|
||||
* Good, because it sets a policy that no obvious accessibility problems should ever be merged
|
||||
* Neutral, because a critical feature could be held up by a relatively less critical but still significant accessibility problem (exceptions could fix this)
|
||||
* Bad (ish), because it will add about 10 minutes to each PR to thoroughly check accessibility.
|
||||
* Bad (ish) because some 508 violations (P0 and P1 bugs) may get merged into test or dev branches, but will never be promoted into production.
|
||||
|
||||
### Confirmation
|
||||
|
||||
The initial decision was confirmed by the IRS' 508 Program Office (IRAP) and by our Contracting Officer's Representative. Project technical staff will continuously confirm adherence to this decision using a PR template, which includes a checklist of items they need to do before it can be merged.
|
||||
|
||||
The updated decision is to supplement manual testing using pa11y or a similar automated testing tool.
|
||||
|
||||
|
||||
## Pros and Cons of the Options
|
||||
|
||||
### Review accessibility factors in every PR, where applicable
|
||||
|
||||
* Good, because it maximizes accessibility compliance opportunities
|
||||
* Good, because it *prevents* accessibility problems from showing up rather than waiting to fix them
|
||||
* Good, because it sets a policy that no obvious accessibility problems should ever be merged
|
||||
* Neutral, because a critical feature could be held up by a relatively less critical but still significant accessibility problem (exceptions could fix this)
|
||||
* Bad (ish), because it will add about 10 minutes to each PR for the submitting engineer and for the (likely designer) accessibility reviewer to thoroughly check accessibility.
|
||||
|
||||
### Review accessibility factors at least every other sprint (once per month)
|
||||
|
||||
* Good, because it is a relatively frequent check for introduced bugs
|
||||
* Neutral, because it sets a middle ground between maximizing accessibility compliance and building capacity
|
||||
* Bad, because reviews may catch a lot of bugs and require prioritization work
|
||||
|
||||
### Review accessibility factors in the final stages of this project before release
|
||||
|
||||
* Good, because it maximizes our capacity for building *something*
|
||||
* Bad, because it is the opposite of best practices
|
||||
* Bad, because problems discovered at that point may be too many to fix for our deadline
|
||||
|
||||
### Supplement manual review with automated accessibility review using a tool like pa11y
|
||||
|
||||
* Good, because it maximizes coverage by automating executed test cases against our a11y targets
|
||||
* Good, because it frees up design time spent on manual reviews
|
||||
* Bad (ish), because it will require some dedicated set up time from devops
|
||||
|
||||
## More Information
|
||||
* Our [PR template](../PULL_REQUEST_TEMPLATE.md) and [design review instructions](../design-review-process.md) are the operational components of this decision. These are based on [documents produced by Truss](https://github.com/trussworks/accessibility) in their past projects.
|
||||
* In case a PR for some reason needs to be merged without an accessibility review, an issue should be made and immediately queued up for work so that any potential issues can be caught as soon as possible.
docs/adr/adr_a11y-target.md
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Accessibility targets"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Accepted"
|
||||
informed: "All project technical staff"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Accessibility targets
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
In supporting a critical task for all U.S. residents, Direct File should be accessible. This decision record sets a clear target for this phase of development to meet those accessibility needs, and we expect to exceed that target when feasible.
|
||||
|
||||
The U.S. describes its accessibility requirements in [Section 508](https://www.section508.gov/) of the Rehabilitation Act. [WCAG](https://www.w3.org/WAI/standards-guidelines/wcag/) provides the same for the internet at large, with three levels of compliance (A, AA, AAA), and has advanced through minor versions (2.0, 2.1, 2.2) over the last 15 years. 508 and WCAG have a [very large overlap](https://www.access-board.gov/ict/#E207.2), and all non-overlapping features unique to 508 are either irrelevant to this project (e.g. manual operation and hardware) or out of this repo's scope (e.g. support channels). See the note in "More Information" below for further information.
|
||||
|
||||
Given these equivalencies, all considered options are oriented toward WCAG and achieve 508 compliance.
|
||||
|
||||
## Considered Options
|
||||
|
||||
* WCAG 2.0 A and AA (base 508)
|
||||
* WCAG 2.1 A and AA (508 superset)
|
||||
* WCAG 2.1 AA (enhanced superset)
|
||||
* WCAG 2.2 AA (forward-thinking)
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "WCAG 2.2 AA (forward-thinking)", because it ensures we meet our baseline and exceed our goal
|
||||
|
||||
### Consequences
|
||||
|
||||
* Good, because it establishes and exceeds 508 compliance
|
||||
* Good, because it is easy to remember the level expected of all elements (i.e. "what needs level A vs AA?")
|
||||
* Good, because it challenges us to maximize WCAG at level AAA
|
||||
* Good, because it sets Direct File up with the latest a11y guidance for years to come
|
||||
* Good, because it doesn't require any more work than WCAG 2.1 AA with our current design
|
||||
* Neutral, because it may require slightly more work than the bare minimum in the future
* Neutral, because some automated a11y tools don't yet support 2.2 (as of Oct 2023)
|
||||
|
||||
### Confirmation
|
||||
|
||||
This decision to exceed 508 requirements is confirmed by the IRS' 508 Program Office (IRAP) and by our Contracting Officer's Representative, Robin Coplen. IRAP will only ensure that the baseline is met (2.0 A and AA). Project technical staff will continuously confirm adherence to this decision's specific version and level using accessibility evaluation tools and manual evaluations when/where appropriate.
|
||||
|
||||
## Pros and Cons of the Options
|
||||
|
||||
### WCAG 2.0 A and AA (base 508)
|
||||
|
||||
* Good, because it establishes 508 compliance
|
||||
* Good, because it is the easiest level to achieve
|
||||
* Bad, because it omits updates and new tech from the 15 years since its publication
|
||||
* Bad, because it does not exceed 508 compliance, as called for in our mission
|
||||
|
||||
### WCAG 2.1 A and AA (508 superset)
|
||||
|
||||
* Good, because it establishes 508 compliance
|
||||
* Good, because it is the easiest level to achieve for the latest published guidance
|
||||
* Bad, because it does not exceed 508 compliance, as called for in our mission, for the latest technologies like mobile devices
|
||||
|
||||
### WCAG 2.1 AA (enhanced superset)
|
||||
|
||||
* Good, because it establishes and exceeds 508 compliance
|
||||
* Good, because it is easy to remember the level expected of all elements (i.e. "what needs level A vs AA?")
|
||||
* Good, because it challenges us to maximize WCAG at level AAA
|
||||
* Neutral, because it may require slightly more work than the bare minimum
|
||||
* Neutral, because it is an outdated version (see "more information" below)
|
||||
|
||||
### WCAG 2.2 AA (forward-thinking)
|
||||
|
||||
* Good, because it establishes and exceeds 508 compliance
|
||||
* Good, because it is easy to remember the level expected of all elements (i.e. "what needs level A vs AA?")
|
||||
* Good, because it challenges us to maximize WCAG at level AAA
|
||||
* Good, because it sets Direct File up with the latest a11y guidance for years to come
|
||||
* Good, because it doesn't require any more work than WCAG 2.1 AA with our current design
|
||||
* Neutral, because it may require slightly more work than the bare minimum in the future
* Neutral, because some automated a11y tools don't yet support 2.2 (as of Oct 2023)
|
||||
|
||||
## More Information
|
||||
|
||||
* For a complete description of Section 508, see https://www.access-board.gov/ict/
|
||||
* WCAG 2.1 is a superset of WCAG 2.0, which will remain the official 508 version until the Access Board approves an update, which can take years.
|
||||
* Most automated testing tools use version 2.1, so 2.0 compliance may appear as failing on tools and 2.2 may not be available. In 2.2's case, it is largely backwards-compatible to 2.1 AA assuming all other criteria are met; therefore, if tools don't reflect 2.2 we can treat their 2.1 AA conformance as equivalent.
|
||||
* Most testing tools are focused on level (A, AA, AAA) of compliance, so setting level AA as our target would be more memorable, programmable, and communicable.
|
||||
* How we achieve these targets, including tooling and processes, is described in a separate decision record.
|
||||
* WCAG 3 is still in a working draft phase, but it provides some guidance that may be truly better but technically non-compliant with 2.2 due to backward incompatibility. For example, WCAG 3.0 uses a new [color contrast algorithm (APCA)](https://github.com/Myndex/SAPC-APCA/blob/master/documentation/WhyAPCA.md#why-the-new-contrast-method-apca/) that better matches reality. In our effort to exceed targets, we may want to use [this calculator](http://www.myndex.com/APCA/) to test contrast among other things to allow us to both exceed our targets and be forward-thinking.
|
||||
* Things 508 covers that WCAG does not explicitly (TLDR: there are no issues):
|
||||
* **Functional Performance Criteria**: at least one method must be provided allowing individuals with disabilities to interact with the product. Most of these are [explicitly covered by WCAG](https://www.section508.gov/content/mapping-wcag-to-fpc/), but two are not:
|
||||
* Without Speech - where speech is used for input, control or operation, ICT will provide at least one mode of operation that does not require user speech. We will not have any speech-based features built into this project, so this is not an issue.
|
||||
* With Limited Reach and Strength - where a manual mode of operation is provided, ICT will provide at least one mode of operation that is operable with limited reach and limited strength. We will not have any manual operation methods, so this is not an issue.
|
||||
* **Hardware and software**: 508 applies to websites like this, as well as hardware and operating systems. We are not building these, so this is not an issue.
|
||||
* **Alternative Means of Communication**: 508 ensures that people with disabilities can effectively communicate and interact with support personnel. Examples of alternative means of communication include relay services for individuals with hearing impairments or providing accessible contact options for individuals with disabilities.
docs/adr/adr_config_storage.md
# ADR: Configuration Storage
|
||||
DATE: 12/08/2022
|
||||
|
||||
Where should the configuration live, how can we ensure that it is version controlled, and what does a change process look like? At a minimum the change process should include the time and date of the change, the user who made it, and a note from that user indicating why they made the change. A more advanced system would also record sign-offs. The actual configuration, and the changes to it, should be stored in git, but git may not be the best tool for disseminating changes out to containers running our software. For that we will need an accessible system that meets or exceeds our SLO. The focus of this document is how to store the data in such a way that it is made available to every container effectively, including audit systems, editing systems, and the tax engine itself.
|
||||
|
||||
## Decision
|
||||
|
||||
We should store the configuration of the tax rules in S3/a document store. The basic pathing will look something like the following:
|
||||
```
|
||||
/{schema-version}/
|
||||
/{schema-version}/config1.xml
|
||||
/{schema-version}/config2.xml
|
||||
/{schema-version}/justification/
|
||||
/{schema-version}/justification/justification.xml
|
||||
```
|
||||
|
||||
The justification is the who, what, when, where, and why of the change.
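As a sketch of how a service might read one of these documents (the bucket name, class shape, and use of the AWS SDK v2 here are illustrative assumptions rather than a settled design):

```java
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;

// Illustrative only: the key layout follows the pathing above; the bucket and class names are placeholders.
public class ConfigStore {
    private static final String BUCKET = "direct-file-config"; // placeholder bucket name
    private final S3Client s3 = S3Client.create();

    /** Fetch a single config document for a schema version, e.g. fetch("2023.1", "config1.xml"). */
    public String fetch(String schemaVersion, String fileName) {
        GetObjectRequest request = GetObjectRequest.builder()
                .bucket(BUCKET)
                .key(schemaVersion + "/" + fileName)
                .build();
        return s3.getObjectAsBytes(request).asUtf8String();
    }
}
```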
|
||||
|
||||
## Rationale
|
||||
|
||||
We need a highly available method of updating n containers in m availability zones. We haven't set any SLOs yet, but money is on them being pretty tight. We need something that can be updated easily, and that won't require any downtime to make an update. We will want to do schema version validation as part of a system health check, but that should be part of a normal health check! The configuration could be stored in the database with the tax information, if we do indeed go that route, but it seems like extra work to mash all of the config into a blob when there is no reason to do so. They are documents and should be in a document store.
|
||||
|
||||
## Rejected ideas
|
||||
|
||||
#### on local storage/in container storage
|
||||
This is the easiest option, and what we used for the prototype, but it would be unwieldy at scale. The update process for the schema would require a bit of downtime as we knock the containers down and bring them back up with the new configuration. There are other ways around that, like supplying it through an endpoint, but that would require verification and be a whole mess.
|
||||
#### shared disk storage
|
||||
This isn't a bad option for how to deal with the tax rules schema changes, but there is a concern of people directly changing it. Even if it were secured and made safe through another piece of software that tracks who changed it, when, and why, it would still represent a single point of failure in the system. There are mitigations for this, but what it starts to look like is a document storage system, which is what this ADR suggests.
|
||||
#### blob storage
|
||||
Blob storage could work for this, but it might be a bit awkward. If the configuration breaks up into multiple files, we could mash them all into one and store them in the blob, but that isn't as clean as keeping the documents separate. This approach does have the benefit of keeping the signoff and the reasoning behind it together.
|
||||
|
||||
## Assumptions
|
||||
|
||||
#### Config is documents
|
||||
This seems like a solid assumption, but it might not be! I am having a hard time imagining something else, but maybe there is something.
|
||||
|
||||
#### We need high availability
|
||||
We believe that the tax system can't go down during tax season. This seems like a really safe bet. It can basically be spun down for months out of the year, but once the season starts, this has to be on all the time and working.
|
||||
|
||||
#### We will have multiple application instances
|
||||
I feel like this is something we should plan for, but hey, maybe I am wrong! I think that it would make sense for us to run several instances of the tax engine behind a load balancer so that we can make sure that it can handle the scale.
|
||||
|
||||
#### Regular disk storage is unreliable/unworkable
|
||||
This assumption comes from not trusting file system watchers, and being concerned about disks in general. It feels like a potential point of failure to hook a shared disk to a container (imagining both running in AWS). There could be issues with the disk, with the connection to it, maybe it gets tampered with (not sure how)... I just trust databases more than a disk (I know they are on disks). I like the automatic backup schedules, the order, and the ability to see and track what is going on. This may just be a prejudice that I am bringing to the table. I trust document stores more than just disk storage. All of the pathing above could be used on a regular disk.
|
||||
|
||||
|
||||
## Constraints
|
||||
- The medium should be protectable in some way, meaning not just anyone can write and read.
|
||||
- The medium should be available to many systems.
|
||||
- The medium can meet auditing requirements (reads and writes are tracked in some way).
|
||||
|
||||
|
||||
## Status
|
||||
Pending
|
||||
|
||||
## Consequences
|
||||
TODO: fill this in when we start seeing consequences!
docs/adr/adr_configuration_management.md
# ADR: Configuration Management
|
||||
DATE: 07/05/2023
|
||||
|
||||
## Background
|
||||
|
||||
A Configuration Management Plan is required in order to maintain proper control over the software and infrastructure in use by DirectFile.
|
||||
|
||||
This plan will be used by developers, operators, and security engineers and assessors in order to verify that the software and services in use have been properly vetted and authorized before being used. The plan must lay out policies and procedures for configuration management that speak to the needs of the Configuration Management Plan (CMP) Data Item Description (DID).
|
||||
|
||||
## Decision
|
||||
|
||||
The CM plan policies will be broken down into component stages for ease of consumption by the day-to-day engineers on the DirectFile team.
|
||||
|
||||
1. Development: This file will define policies and procedures for integrating new configuration and updating configuration of application code, including adding new features and required CI/CD scans and Pull Request approvals. This should also include any details on software that must be installed on developers' laptops above and beyond the standard-issue IRS GFE.
|
||||
1. Infrastructure: This file will define policies and procedures for our infrastructure-as-code (IaC) implementation. This should also include details on how the baseline of deployed services is maintained, verified, and audited.
|
||||
1. Deployment: This file will define policies and procedures for deploying changes to various IEP environments. This should also include details on how the baseline of runtime configuration is maintained, verified, and audited.
|
||||
|
||||
These files will be referenced by link to the established CMP DID to ensure each section of the CMP is covered by our policies and procedures. They should also be referenced from the full project's README.
|
||||
|
||||
## Rationale
|
||||
|
||||
The referential approach serves two purposes:
|
||||
|
||||
1. Day-to-day usefulness. By storing the CM policies alongside the code and IaC implementations, we reduce the onboarding burden, and ensure all engineers know the policies they are required to follow in developing and deploying DirectFile.
|
||||
1. Accurate compliance. By utilizing the CMP DID as designed, we ensure that we speak to all aspects of IRS CMP processes. Additionally, by utilizing links to version-controlled documents rather than copying and pasting we ensure that our compliance documentation is accurate and up-to-date at all times.
|
||||
|
||||
## Status
|
||||
|
||||
Proposed
|
||||
|
||||
## Consequences
|
||||
|
||||
Each file must include some boilerplate to ensure future updaters know that certain questions from the CMP DID are being answered by the CM policies.
docs/adr/adr_database-migrations.md
# Database migration management
|
||||
|
||||
Written: August 17, 2023
|
||||
|
||||
## Background
|
||||
|
||||
Direct File database schema changes are presently handled by Hibernate's auto DDL and run automatically at application start. This is simple and easy for early development efforts and can handle many database migrations automatically. Modifying an `@Entity`-annotated class is all that is necessary to make most schema changes, even destructive ones (depending on configuration).
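For context, with Spring Boot this behavior is typically governed by a single property; the snippet below illustrates the current style versus where a migration tool would take us, and is not a copy of our actual configuration:

```
# Illustrative only - not our actual config files.
# Current style: Hibernate derives and applies schema changes at application start.
spring.jpa.hibernate.ddl-auto=update

# After adopting a migration tool: Hibernate only checks that entities match the schema.
# spring.jpa.hibernate.ddl-auto=validate
```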
|
||||
|
||||
As Direct File matures and approaches testing and eventual production usage, we must be more careful about changes to the database schema and take steps to reduce the likelihood of unintended changes. One way to do this is to integrate a tool that supports evolutionary database design.
|
||||
|
||||
### Required features
|
||||
|
||||
* Allow separation of database changes and code changes
|
||||
* Easily determine what changes will happen when migrations are applied
|
||||
* Allow database changes to be applied separately from application start
|
||||
* Easily determine the current migration revision of the database
|
||||
* Allow migration to a specified revision, which may not be the newest
|
||||
|
||||
### Evaluation criteria
|
||||
|
||||
* Integrates with Java tooling
|
||||
* Integrates into code review process
|
||||
* Allows separation of privileges for database users
|
||||
* Ideally, already available in IRS artifact repository
|
||||
|
||||
## Options considered
|
||||
|
||||
While there are many tools available for managing migrations that could work for Direct File, there are two that rise to the top due to their extensive usage and ubiquity in Java-based applications. They are [Liquibase](https://www.liquibase.org/) and [Flyway](https://flywaydb.org/).
|
||||
|
||||
Each of these meets the requirements for use by Direct File.
|
||||
|
||||
### Liquibase
|
||||
|
||||
* All migrations in a single file
|
||||
* There is a choice of file format (YAML, XML, etc) and individual migrations can be specified directly as SQL or in a database-agnostic format.
|
||||
* Easy to view all migrations and determine the order they will execute
|
||||
* Requires coordination between developers for concurrent development of migrations
|
||||
* Database changelog can get large and unwieldy over time
|
||||
* Already available in IRS artifact repo (both `liquibase-core` and `liquibase-maven-plugin`)
|
||||
|
||||
### Flyway
|
||||
|
||||
* Migrations split into multiple files
|
||||
* Each migration is SQL
|
||||
* Potentially easier to concurrently add migrations, but still requires coordination between developers for concurrent development of migrations
|
||||
* Possibly more difficult to follow the flow of migrations
|
||||
* Already available in IRS artifact repo (both `flyway-core` and `flyway-maven-plugin`)
|
||||
|
||||
## Recommendation
|
||||
|
||||
Liquibase
|
||||
|
||||
Direct File is relatively small and should remain that way for some time. As the system has begun to grow, additional purpose-specific small services have been added. Those may have their own associated data stores, keeping the complexity of any single database limited.
|
||||
|
||||
Liquibase's single-file migration changelog will make it simple to view existing migrations and the low complexity of our database will mean the changelog should not grow out of control.
|
||||
|
||||
## Decision
|
||||
|
||||
Direct File will use Liquibase.
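As a rough sketch of what a changelog entry could look like (the table, column, changeset id, and author values are placeholders, not a real Direct File migration):

```xml
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-4.4.xsd">

  <!-- Placeholder example: add a nullable column to an existing table. -->
  <changeSet id="20230817-add-example-column" author="direct-file">
    <addColumn tableName="tax_return">
      <column name="example_column" type="varchar(255)"/>
    </addColumn>
  </changeSet>
</databaseChangeLog>
```

Each change lands as a reviewable changeset in the same changelog, which is the property the recommendation above leans on.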
docs/adr/adr_design-review.md
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Design & a11y reviews"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "Proposed"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Design & a11y reviews
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
In order to work in an agile fashion and ensure code changes meet both design spec and our [accessibility target](adr_a11y-target.md), we should align on a consistent process to achieve this. As part of the [accessibility review](adr_a11y-methods.md) for each PR, the original developer will do a first pass of both design and a11y review. This DR covers how designers will be the second set of eyes on those user-facing changes.
|
||||
|
||||
## Considered Options
|
||||
* Review each PR locally before merging
|
||||
* Review all merged PRs for the current sprint in the deployment environment at the end of the sprint
|
||||
* Review entire application in the deployment environment every other sprint
|
||||
* Review entire application in the deployment environment when designers are available
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "Review all merged PRs for the current sprint in the deployment environment at the end of the sprint", because it is the best we can reasonably do given our comfort with the tools and time available.
|
||||
|
||||
### Consequences
|
||||
|
||||
* Good, because it maximizes coverage
|
||||
* Good, because it does a decent job of preventing issues from sneaking by for more than a sprint
|
||||
* Bad, because it requires extra time to create a new issue for each problem discovered
|
||||
* Bad, because it requires extra time to filter Github PRs by date
|
||||
|
||||
### Confirmation
|
||||
|
||||
This decision is confirmed by all designers, as they will be doing this work.
|
||||
|
||||
## Pros and Cons of the Options
|
||||
|
||||
### Review each PR locally before merging
|
||||
|
||||
* Good, because it maximizes coverage of changes
|
||||
* Good, because it prevents almost all issues from sneaking by
|
||||
* Good, because feedback can go directly into the PR
|
||||
* Bad, because it requires frequent intervention by designers who may not be comfortable with necessary tools
|
||||
|
||||
### Review all merged PRs for the `{current | previous}` sprint in the deployment environment at the `{end | beginning}` of the sprint
|
||||
|
||||
* Good, because it maximizes coverage of changes
|
||||
* Good, because it does a decent job of preventing issues from sneaking by for more than a sprint
|
||||
* Bad, because it requires extra time to create a new issue for each problem discovered
|
||||
* Bad, because it requires extra time to filter Github PRs by date
|
||||
|
||||
### Review entire application in the deployment environment every other sprint
|
||||
|
||||
* Good, because it maximizes coverage of the final user experience (end to end)
|
||||
* Neutral, because it does a decent job of preventing issues from sneaking by for more than two sprints, but can mean more work is needed to recover from it
|
||||
* Bad, because it requires extra time and cognitive load to check what was merged and identify in the environment what to review. It's more likely in two sprints vs. one that multiple PRs covered a certain page/feature, so filtering PRs by date is less useful.
|
||||
* Bad, because it requires extra time to create a new issue for each problem discovered
|
||||
|
||||
### Review entire application in the deployment environment when designers are available
|
||||
* Neutral, because it could maximize coverage of the final user experience (end to end), but only as much as the designer has time for
|
||||
* Bad, because lingering design/a11y debt can be harder to pay off the longer it lingers
|
||||
* Bad, because it requires extra time and cognitive load to check what was merged and identify in the environment what to review. It's highly likely that multiple PRs covered a certain page/feature, so filtering PRs by date is useless.
|
||||
* Bad, because it requires extra time to create a new issue for each problem discovered
|
||||
|
||||
## More Information
|
||||
* The design review checklist is located in the PR template for the first option and in a [separate document](../design-review-process.md) for all other options.
docs/adr/adr_email_submit_error.md
## Overview
|
||||
There are two main scenarios that, as it currently stands, the combination of our incident response (IR) practices and customer support (CS) motion does not support:
|
||||
|
||||
a) A taxpayer hits submit, but an XML validation failure occurs and they cannot submit at all. CS is unavailable (for whatever reason) and they give up attempting to submit with Direct File.
|
||||
b) A taxpayer submits successfully, but an internal error occurs in the Submit application that blocks our ability to send their return to MeF. CS is not involved because, presumably, the taxpayer never reached out to CS in the first place, as they didn't run into any errors when hitting submit.
|
||||
|
||||
Our inability to reach out to taxpayers proactively is primarily due to the fact that CS can only reach out to taxpayers who reach out to them first via eGAIN.
|
||||
|
||||
In the above scenarios, neither taxpayer knows 1) whether the issue they ran into has been or will be fixed; and 2) if so, that they should resubmit their return. As a result, they are effectively left in the dark and/or knocked out of filing with Direct File, thereby forcing them to submit elsewhere.
|
||||
|
||||
While our scale to-date (as of 2/26) has shielded us from the pain of these scenarios, or alleviated them altogether, we don't have a clear way to address them at this moment. Further, there is a very high likelihood that these scenarios will occur over the coming weeks and will become especially painful if/when the submission volume scales dramatically faster than our CS capabilities.
|
||||
|
||||
|
||||
## Proposals
|
||||
|
||||
While we cannot change our CS motion to support this, we can enable a better product experience through how we notify taxpayers via email when an error occurs in our system.
|
||||
|
||||
### Notify taxpayers when we cannot submit due to internal error
|
||||
We should notify taxpayers via email that there was an error submitting their return at the time of submission, i.e. XML validation failure, or XML validation success but Submit app failure. The technical infrastructure to send emails on XML validation failure is already in place; we just need to create the HTML template for the email and actually send an email when XML validation failure occurs. Similarly, the requirements to send an email due to a post-submission error when trying to submit to MeF can be found here, and just need to be actioned.
|
||||
|
||||
### Notify taxpayers when Direct File has deployed a fix that should allow them to resubmit their return
|
||||
We should also notify taxpayers via email that they are able to submit their return when we have deployed a fix into production that addresses the error that blocked them from submitting in the first place.
|
||||
|
||||
|
||||
### Proposed Technical Changes (Rough)
|
||||
|
||||
1. Add two new HTML templates to capture the two notification scenarios above, e.g. SubmissionErrorTemplate and ErrorResolvedTemplate. The templates should be added to the backend app, such that the ConfirmationService can process them, as well as the actual HTML template in the email app that is sent via email and rendered to the taxpayer
|
||||
2. When an XML validation failure occurs during submission create a `SubmissionEvent` with an `eventType` of `error_xml` and enqueue a message from the backend to the email app to notify the user (naming of the eventType is TBD, might make sense to add a new `message` column and keep the eventType as `error`)
|
||||
3. Update the SQS message sent from submit -> backend (on the submission confirmation queue) to allow for an `error` status. If the ConfirmationService and SendService are properly configured as per #1 above, everything should flow seamlessly. Similar to #2, create a `SubmissionEvent` with an `eventType` of `error_mef` for each submission that failed to submit to MeF (naming of the eventType is TBD; it might make sense to add a new `message` column and keep the eventType as `error`). A rough sketch of this flow follows the list below.
|
||||
4. Add a function to the backend that, when called, ingests a CSV of `taxReturnIds`, transforms the list into a SubmissionStatusMessage and calls ConfirmationService.handleStatusChangeEvent
|
||||
5. Once a deploy goes out that fixes the underlying issue, create a CSV with `taxReturnIds` of the affected taxpayers (both those who reached out to CS and those who did not) using Splunk queries
|
||||
6. Send this CSV to IEP and ask 1) their System Admin to run the command specified in #4; or 2) have them upload it to S3 and do something similar to the email allow list, such that the function specified in #4 polls S3 and sends emails based off this polling. This second approach would require more state management but would possibly cut out the need for IEP to run commands and maybe obviate the need for a p2 to make this happen.
|
||||
7. Add monitoring to observe the emails being sent out accordingly
|
||||
8. [Alternative] We move the 'submitted' email to send only after we receive submission confirmation, not after we pass XML validation
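A rough sketch of the error-handling flow in #2/#3 above; aside from `SubmissionEvent` and the `error_xml`/`error_mef` event types named in this proposal, every name here is a hypothetical placeholder rather than the real Direct File code:

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical sketch; only SubmissionEvent and the error_xml/error_mef event types come from this proposal.
public class SubmissionErrorNotifier {

    public record SubmissionEvent(UUID submissionId, String eventType, Instant createdAt) {}

    public interface SubmissionEventStore { void save(SubmissionEvent event); }
    public interface EmailQueue { void enqueueErrorEmail(UUID taxReturnId, String eventType); }

    private final SubmissionEventStore store;
    private final EmailQueue emailQueue;

    public SubmissionErrorNotifier(SubmissionEventStore store, EmailQueue emailQueue) {
        this.store = store;
        this.emailQueue = emailQueue;
    }

    /** Record the failure as an append-only event, then notify the taxpayer by email. */
    public void onSubmissionError(UUID taxReturnId, UUID submissionId, boolean xmlValidationFailure) {
        String eventType = xmlValidationFailure ? "error_xml" : "error_mef";
        store.save(new SubmissionEvent(submissionId, eventType, Instant.now()));
        emailQueue.enqueueErrorEmail(taxReturnId, eventType);
    }
}
```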
docs/adr/adr_encrypting-taxpayer-data.md
# Encrypting taxpayer data
|
||||
|
||||
Date: 07/31/2023
|
||||
|
||||
## Status
|
||||
|
||||
Approved
|
||||
|
||||
## Context
|
||||
|
||||
Our system stores fact graphs containing sensitive taxpayer data including personally identifiable information (PII) and federal tax information (FTI). To mitigate the impact of a breach or leak, we want an approach for encrypting this data at rest. This approach should satisfy both our team's desired security posture and relevant compliance controls.
|
||||
|
||||
To date, we have considered a client-side approach and a server-side approach.
|
||||
|
||||
## Decision
|
||||
|
||||
We will implement server-side envelope encryption for securing taxpayer fact graphs at rest. We will generate a per-user symmetric data encryption key and use that key to encrypt fact graphs before storage in the taxpayer data store. The data encryption key will be encrypted using a root key and the encrypted key will be stored next to the fact graph in the taxpayer data store. The root key will be managed by AWS's Key Management Service (KMS).
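A minimal sketch of that envelope pattern, assuming the AWS SDK v2 and a placeholder root key ARN (class names and the returned record layout are illustrative; in practice the per-user data key would be generated once and reused rather than on every call):

```java
import software.amazon.awssdk.services.kms.KmsClient;
import software.amazon.awssdk.services.kms.model.DataKeySpec;
import software.amazon.awssdk.services.kms.model.GenerateDataKeyRequest;
import software.amazon.awssdk.services.kms.model.GenerateDataKeyResponse;

import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

// Hedged sketch of the envelope pattern described above; the key ARN and record layout are placeholders.
public class FactGraphEncryptor {
    private static final String ROOT_KEY_ARN = "arn:aws:kms:...:key/placeholder"; // placeholder
    private final KmsClient kms = KmsClient.create();
    private final SecureRandom random = new SecureRandom();

    /** Ciphertext plus the KMS-encrypted data key that is stored next to it. */
    public record EncryptedFactGraph(byte[] ciphertext, byte[] iv, byte[] encryptedDataKey) {}

    public EncryptedFactGraph encrypt(byte[] factGraphBytes) throws Exception {
        // 1. Ask KMS for a data key (shown inline here; generated once per user and reused in practice).
        GenerateDataKeyResponse dataKey = kms.generateDataKey(GenerateDataKeyRequest.builder()
                .keyId(ROOT_KEY_ARN)
                .keySpec(DataKeySpec.AES_256)
                .build());

        // 2. Encrypt the fact graph locally with AES-GCM using the plaintext data key.
        byte[] iv = new byte[12];
        random.nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(dataKey.plaintext().asByteArray(), "AES"),
                new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(factGraphBytes);

        // 3. Persist only the ciphertext, IV, and the KMS-encrypted copy of the data key.
        return new EncryptedFactGraph(ciphertext, iv, dataKey.ciphertextBlob().asByteArray());
    }
}
```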
|
||||
|
||||
## Consequences
|
||||
|
||||
- The server-side approach reduces implementation complexity, allowing us to meet our desired security qualities and maintain timelines for pilot launch
|
||||
- We will need to implement additional mitigations to avoid information disclosure of plaintext fact graph data (e.g., through logging)
|
||||
- We aren't reinventing the wheel, and we can take advantage of industry-standard encryption functionality provided by AWS KMS
|
||||
- Plaintext fact graphs will be visible to the public-facing API gateway and other supporting services that sit between the web frontend and the Direct File API. Plaintext fact graphs will also be accessible to an administrator with the necessary decrypt permissions within KMS
|
||||
- If needed, fact graph data can be migrated server-side
|
||||
- Future layers of protection (e.g., message-level encryption) can be added as our threat model matures
|
||||
- We will need to identify a stand-in for KMS in local environments and put any KMS-specific code behind an abstraction layer
docs/adr/adr_error-messages.md
---
|
||||
# Configuration for the Jekyll template "Just the Docs"
|
||||
parent: Decisions
|
||||
nav_order: 100
|
||||
title: "Error message types"
|
||||
|
||||
# These are optional elements. Feel free to remove any of them.
|
||||
status: "In review"
|
||||
---
|
||||
<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
|
||||
<!-- markdownlint-disable-next-line MD025 -->
|
||||
# Error message types
|
||||
|
||||
## Context and Problem Statement
|
||||
|
||||
We use error messages across the application, many of which follow certain patterns. We also have a challenge of translating all content, including those messages, which increases for each unique string. This decision record seeks to resolve both issues.
|
||||
|
||||
YYY and XXX have proposed an initial set of [patterns](#patterns) below that could be used for either option 1 or 2 below. Either way, those exact patterns should be considered independent of this decision itself; they are presented as a clear example of the outcome of this decision.
|
||||
|
||||
## Considered Options
|
||||
|
||||
1. Create a set of patterns for all fields to use exactly
|
||||
2. Create a set of patterns to use as a guide for many unique strings
|
||||
3. Use any string we want in each error message
|
||||
|
||||
## Decision Outcome
|
||||
|
||||
Chosen option: "Create a set of patterns for all fields to use exactly", because it minimizes translation cost, maximizes engineering efficiency, and minimizes content review.
|
||||
|
||||
### Consequences
|
||||
|
||||
* Good, because it maximizes consistency in implementation
|
||||
* Good, because only the source strings need content review and translation work
|
||||
* Good, because all instances of the string can be updated in one code change
|
||||
* Good, because it sets a consistent content UX
|
||||
* Neutral, because some fields may still need custom strings
|
||||
* Bad, because some strings will be ok but not great for some instances
|
||||
|
||||
## Pros and Cons of the Options
|
||||
|
||||
### Create a set of patterns to use as the sole source for all fields
|
||||
|
||||
* Good, because it maximizes consistency in implementation
|
||||
* Good, because only the source strings need content review and translation work
|
||||
* Good, because all instances of the string can be updated in one code change
|
||||
* Good, because it sets a consistent content UX
|
||||
* Neutral, because some fields may still need custom strings
|
||||
|
||||
### Create a set of patterns to use as a guide for field-specific strings
|
||||
|
||||
* Good, because it sets a clear expectation for implementation
|
||||
* Neutral, because similar/same strings may be easily discovered and grouped for content review and translation work
|
||||
* Bad, because every string needs its own translation work
|
||||
* Bad, because every string needs engineering work to update it
|
||||
|
||||
### Use any string we want in each error message
|
||||
|
||||
* Good, because it maximizes flexibility for initial implementation
|
||||
* Bad, because every string needs its own content review and translation work
|
||||
* Bad, because every string needs engineering work to update it
|
||||
* Bad, because it may be confusing if the same type of error has different formulations across fields
|
||||
|
||||
## Patterns
|
||||
|
||||
| Type | Formula | Examples |
|
||||
| -------------- | --------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
|
||||
| Required | This is required | [Same] |
|
||||
| Match/re-enter | This does not match | [Same, and the field label should indicate what it matches to] |
|
||||
| Length maximum | Must have fewer than [x] [y] | Must have fewer than 8 numbers<br>Must have fewer than 8 numbers and letters |
|
||||
| Length minimum | Must have at least [x] [y] | Must have at least 16 numbers |
|
||||
| Length target | Must have exactly [x] [y] | Must have exactly 10 numbers |
|
||||
| Length range | Must have between [x] and [y] [z] | Must have between 8 and 16 numbers |
|
||||
| Amount maximum | Must be less than [x] | Must be less than 100 |
|
||||
| Amount minimum | Must be at least [x] | Must be at least 10 |
|
||||
| Amount range | Must be between [x] and [y] | Must be between 10 and 100 |
|
||||
| Date maximum | Must be [x] or earlier | Must be December 31, 2023 or earlier;<br>Must be today or earlier;<br>Must be tomorrow or earlier;<br>Must be yesterday or earlier |
|
||||
| Date minimum | Must be [x] or later | Must be January 1, 2023 or later;<br>Must be today or later;<br>Must be tomorrow or later;<br>Must be yesterday or later |
|
||||
| Date range | Must be between [x] and [y] | Must be between January 1, 2023 and December 31, 2023;<br>Must be between January 1, 2023 and today;<br>Must be between today and tomorrow;<br>Must be between yesterday and tomorrow |
|
||||
| Allow list | Must have only [x] | Must have only English letters;<br>Must have only numbers;<br>Must have only English letters and numbers;<br>Must have only numbers, parentheses, and apostrophes;<br>Must have only !@#$%^&\*() |
|
||||
| Example | This should look like [x] | This should look like username@website.com;<br>This should look like 123-456-7890 |
docs/adr/adr_resubmission.md
Date: 12/23
|
||||
### Introduction
|
||||
|
||||
The concept of a tax return has recently evolved with the introduction of resubmissions. Whereas in the one-shot submission case we could assume a linear progression of data, with resubmissions we are faced with non-linear and cyclical progressions of a tax return through the various services, effectively repeating the submission process until we receive an accepted status.
|
||||
|
||||
### Objectives
|
||||
|
||||
- Define a database schema for the backend database that enables **painless, scalable and future-proofed** tracking of the entire submission lifecycle at the database level with **minimal future migrations and backfills**.
|
||||
- Minimize the number of unnecessary changes (migrations, backfills)
|
||||
- We don't fundamentally know what we will need from a data storage perspective at this moment in time, so we should approach it from the standpoint of 'better safe than sorry' for the pilot. We might store more data than we need if everything goes well, but if things don't go well we will be happy that we have the data lying around.
|
||||
|
||||
### Why make more changes to the `backend` schema
|
||||
|
||||
Upon further requirement gathering and edge cases cropping up, it appears that the schema changes we initially agreed to previously aren't going to scale well in the long term. Specifically, I don't believe that, based on the #3562 schema changes, we could reliably reconstruct the history of a taxpayer's journey through DF if they have to resubmit their return given our current models, or describe what events happened at what time in said journey. Reconstructing the history from the data/backend perspective is a business requirement in my mind (for a variety of reasons) and should be solved pre-launch. As a result, I think we need to evolve our data modeling of the `TaxReturn` object and its relations a bit more to capture the domain we are after.
|
||||
|
||||
To be clear, this is less driven by our need to enable analytics at the database level (that is covered elsewhere). Rather, it is around modeling submissions, tax returns, and status changes in a way that uplevels our observability into a taxpayer's journey through DF from a backend perspective. Even though we will likely not have access to production read-replicas for the pilot, we should still have an observable system with clean architecture.
|
||||
|
||||
### Why not make the minimally viable set of changes now and wait until after the pilot to make more changes
|
||||
|
||||
1. _As soon as we start populating our production database, the cost of making changes in the future is orders of magnitude higher than making them now._ In other words, if we don't make changes now but realize we need to make changes in June 2024, we now have to worry about taxpayer data loss as a consequence of executing migrations and backfills incorrectly.
|
||||
2. In prod-like environments, I presume that we should not, if not cannot, pull taxpayer data from S3 (XML) or decrypt encrypted data (factgraph) to answer questions about a taxpayer's journey through DF. This would violate compliance firewalls on many levels and I assume that we should not rely on these avenues.
|
||||
3. Aside from wanting better observability as we roll out in Phase 1-2 to give us confidence, from an incident response standpoint we need a way to break the glass and run SQL queries against the prod db if we are in trouble, even if that means asking a DBA on EOPS/IEP to make the query for us. If and when this happens, those queries should be performant, simple, and easy for a larger audience to understand without deep technical knowledge of our system. As described below, cleaner data modeling with a limited number of mutable fields makes analysis much easier
|
||||
4. Our analytics integration (with whatever acronym'd platform we are going with these days) will land TBD in Q1/Q2 2024, assuming it lands on time. This means that if we need to run baseline reporting before we can access the analytics platform, we will need to do it at the database level based on whatever schema we decide on.
|
||||
5. We haven't actually made meaningful changes to the backend database that populate data into tables in a new way. This means that any future changes to the codebase or database are still, effectively, net new and wouldn't require us to refactor work we just implemented.
|
||||
|
||||
### Current Schema Design
|
||||
|
||||
Our original implementation of the backend schema model system only contained the `TaxReturn` model. See Original Schema image in description.
|
||||
|
||||
The most recent iteration proposed in #3562 used a combination of `TaxReturn` with a new 1:M relation to an event-tracking model, `TaxReturnSubmission` (the mental model of how this works is described in "Epic: Resubmit Return After Rejection"). See the 'Interim Schema originally agreed upon in resubmission epic scoping' image in the description.
|
||||
|
||||
After some more research into resubmissions and discussions with various team members working on related work, I believe that the above combination of `TaxReturn` and `TaxReturnSubmission` is still insufficient to achieve what we want. While we could handle both of these concerns at the `TaxReturnSubmission` level, this would result in 1) a model that begins to drift away from its original purpose, namely to track status change events through DirectFile; 2) a model that needs to replicate a bunch of data each time we want to create a new event (such as facts and return headers) without any discernible value; and 3) a model that begins to combine mutability (session time tracking) with immutability (status logging). For instance, if a taxpayer spent 20 minutes on the original submission and 5 minutes correcting errors and resubmitting, we should be able to attribute 20 minutes to the original submission and 5 to the correction, rather than having a single 25-minute field that cannot be attributed to the original or correction session. We cannot do this right now.
|
||||
|
||||
|
||||
### Schema Proposal
|
||||
|
||||
See Proposed Schema in description
|
||||
|
||||
We move to a schema definition that more neatly decouples mutable data and immutable data by:
|
||||
|
||||
1. Remove the `submit_time` column from the `TaxReturn`.
|
||||
2. Rename `TaxReturnSubmission` to `Submission` but maintain the M:1 relationship to `TaxReturn`. The `Submission` table is meant as a map between a single tax return and all the submissions that comprise the journey from creation to completion (ideally acceptance). Each `Submission` for a given `TaxReturn` maps to a snapshot of various data, most importantly the facts and return headers, when the /submit endpoint is called. As a result, we should add a `facts` column to the table, similar to the data definition on `TaxReturn`. The key difference for these two fields is that the data stored on the `Submission` table is immutable and represents our archival storage of the facts and return headers at a single moment in time (submission). When a new submission occurs, we create a new `Submission` record to track this data, and copy the related `TaxReturn`'s `facts` onto the `Submission`.
|
||||
3. Add a `SubmissionEvent` model that FKs to `Submission`. The sole purpose of the `SubmissionEvent` model is to track status changes in our system, thereby functioning as an append-only audit log of events (effectively a write-ahead log that we can use to play back the history of a submission from the standpoint of our system). Any immutable data corresponding to a status change event should live on this model. The statuses will be broader than MeF statuses alone. A rough sketch of how these relations might look follows below.
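A rough sketch of the proposal as JPA entities; the field names, types, and annotations are illustrative placeholders, not the final schema:

```java
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToOne;
import java.time.Instant;
import java.util.UUID;

// Illustrative placeholders only.
@Entity
class TaxReturn {
    @Id @GeneratedValue UUID id;
    String facts; // mutable working copy of the fact graph
}

@Entity
class Submission {
    @Id @GeneratedValue UUID id;

    @ManyToOne(optional = false)
    TaxReturn taxReturn; // M:1 back to the tax return this attempt belongs to

    String facts;        // immutable snapshot copied from TaxReturn at /submit time
    Instant createdAt;
}

@Entity
class SubmissionEvent {
    @Id @GeneratedValue UUID id;

    @ManyToOne(optional = false)
    Submission submission; // append-only log entry for a single submission

    String eventType;       // broader than MeF statuses, e.g. submitted, accepted, rejected
    Instant occurredAt;
}
```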
docs/adr/adr_user_supplied_info.md
# ADR: Data Storage
|
||||
DATE: 12/21/2022
|
||||
|
||||
## Background
|
||||
|
||||
Our system will utilize and produce several kinds of data in its normal operation, each of which need to be considered when discussing storage. The categories are as follows:
|
||||
1. **User Information**: We plan to use one or more third party authentication providers to provide verification of a user's claimed or proved identity. Those users will have preferences and personal information that our system will need. Although some of that information may be stored by the authentication provider, we will ask for it ourselves to avoid IAL2 identity proofing burdens.
|
||||
2. **Unfiled Tax Information**: The goal of our system is to produce or file a tax return. To get to the point of filing or producing a tax return, a user will be asked a series of questions that will drive our system's ability to fill in the form(s) provided by the IRS. The answers to these questions must be stored and updated.
|
||||
3. **Filed Tax Information**: When the user is ready, our system must offer them a method by which they may file their tax return. This could be either through the MeF or by mailing in a PDF. Our system must maintain the artifacts it creates for the period required by the agency. They should also be stored with information that would allow us to replicate the system they were run on, meaning version numbers for all software and schemas involved.
|
||||
4. **Tax Software Schema Information**: The system asks a series of questions and collects information from the user. The questions asked and the structure by which the data is understood must be stored and versioned. In the initial version of the product we used configuration for both the question flow and the facts that were answered by the questions. We imagine something similar will be necessary.
|
||||
5. **Permissions**: A user may have access to multiple years of returns, as well as returns filed by someone else on their behalf. In the first year, it may be required for multiple people to use the same account to file jointly. After the first year however, we may want to allow multiple accounts to view the same return and sign independently. We can also imagine a scenario where someone helps another person file taxes, or maybe needs access to previous years for another person in order to help them file this year. The implication is a sort of permission system which is out of scope for this document.
|
||||
|
||||
These general categories constitute the basic data that our system will use and generate. The decisions on how to store these pieces of information and how we came to those decisions are the subject of this document.
|
||||
|
||||
## Decision
|
||||
|
||||
The decisions here are going to be put in technology neutral terms, referring to general categories of systems rather than by specific brand names. Each of the categories above are handled differently in our system.
|
||||
1. **User Information**: We expect that a minimum of data will come to us from the authentication provider. We will need to ask the user for most of the necessary personal information and store it ourselves. We will use a relational database table to store the ID from the authentication provider, the type of authentication provider (i.e. the name of the service), an ID specific to our system, and the necessary personal information. User preferences will be stored in the relational database as either a blob or as a table.
|
||||
2. **Unfiled Tax Information**: A blob in the system's relational database is our preferred approach to handling information on which the user is actively working.
|
||||
3. **Filed Tax Information**: A document storage system outside of the system's relational database is how we feel that this data should be stored.
|
||||
4. **Tax Software Schema Information**: This can be stored in either a document store or in the relational database. It is slightly preferred to keep the schema in the relational database as it will allow for metadata storage and easier querying.
|
||||
5. **Permissions**: The permissions on tax returns will be stored in the relational database.
|
||||
|
||||
## Rationale
|
||||
|
||||
#### User Information
|
||||
|
||||
To receive personal information from authentication providers, we would have to burden users with IAL2 identity proofing requirements. In order to avoid this, we will ask users to provide the information, and store it ourselves.
|
||||
|
||||
By storing information about the user that the auth system also stores, we will make it difficult for the user to anticipate behaviors between the two systems. One can imagine the frustration scenario of updating your information in our system with the expectation that it will be updated in the auth provider (why would I need to update it twice?). The opposite direction is equally annoying. Why didn't you just pull my information from the auth provider rather than making me update it in your system as well?
|
||||
|
||||
We will need to mitigate these impacts with careful messaging and communications.
|
||||
|
||||
Our use of a database table to store the user information linkage into our system is partly out of necessity and partly convenience. We would like a unified ID internal to our system that connects users to tax records. Also, we don't want to be reliant on any particular third party auth provider. Beyond these two concerns, we would also like to not lock the user into a specific email address. There are a few ways to meet these requirements, for example linking the records by PII. We felt that using PII would not be in the best interest of the user. There are conditions under which a user's most sensitive PII, like their social security number, may change, whereas the ID they were assigned in the third party auth system will not change. The third party ID approach will also make it far more difficult to identify a user by their user record.
|
||||
|
||||
The user preferences are not of substantial concern. They do not require encryption, and they do not represent any real load on the system. Most likely they will be a set of key value pairs that represent how a user would like to see the site.
|
||||
|
||||
**Key Ideas**
|
||||
- We don't want to burden users with meeting IAL2 identity proofing requirements.
|
||||
- We don't want to lock our users into an email.
|
||||
- We don't want to lock ourselves into a third party auth vendor.
|
||||
- We don't want to use PII which might change to identify records.
|
||||
- We would like an internal ID to identify users and their records that are unique to our system.
|
||||
|
||||
#### Unfiled Tax Information
|
||||
|
||||
There are several complexities that storing the active tax data in a blob solves. The first is that it doesn't lock us into using either snapshots or deltas. In a document storage context it would be easier to wipe out the old document and write a new one each time the user updates. This could be countered with an event/command sourcing pattern, which would then require us to validate each message is received, processed and stored. The easiest way to allow for either update concept is to use a JSON blob. If we are given new key value pairs, we simply have to apply them to the blob. If we want to wipe out the whole blob and rewrite it, we easily can.
|
||||
|
||||
The next important value is atomicity. Anything we do in the database can be done in a transaction if necessary, allowing us to roll back if there are any problems. The same could not be said if the data is stored in a document storage system elsewhere. Imagine, for example, that we would like to store the last updated time on the document. We could perform that update in our database, and then attempt to write the document to the document storage system only to have it fail. We now have a belief in our database, that the record was updated at time X, that is false. This isn't good. This same action could create broken links in the system.
|
||||
|
||||
Storing the information outside of the database requires extra hops. If the information is linked but stored in another document storage system, we will still have to go to the database to read the link and find the document. This means that rather than doing one database read we will have to do one read on the relational database and one on the document store. From an efficiency standpoint, this is a bad idea.
|
||||
|
||||
The data we are planning to store is not so large as to cause a problem for any common RDBMS provider. There won't be any particular slowdown or read issues associated with reading the data out of the database.
|
||||
|
||||
The final point is around single points of failure in the system. We can architect around having the database be a single point of failure in our system: there are a lot of tools to avoid that being a problem (like replication/failovers for example). If our database system does go completely down, our system will go down as well. Introducing another system doesn't solve that problem, and it in fact compounds the problem. Now we are reliant on two systems not going down rather than one. The likelihood of either going down is fairly low, but we have doubled our chances of trouble if we move any "active" data out of our database.
|
||||
|
||||
**Key Ideas**
|
||||
- No lock in with respect to snapshot vs deltas
|
||||
- Relational database operations are atomic, and can be rolled back on failure
|
||||
- Storing data in multiple systems is less efficient as we will have to jump multiple times to get the data.
|
||||
- Our data is not too large for efficient database storage.
|
||||
- We already have the database as a failure point, why add more?
|
||||
|
||||
#### Filed Tax Information
|
||||
|
||||
The information generated around filing consists of PDF(s) showing the tax form as we have filled it out, the information we sent to the MeF, a copy of the data used to generate the form, and some information about the version of the system and the schemas used to file. This is read-only information meant to be a record of what was submitted to the IRS (which may also be helpful if we ever have to audit our system).
|
||||
|
||||
The first, obvious point is that these are mostly documents. The PDFs in particular are definitely files that can and should be stored as files. We don't know upfront the total number of documents that may be required. We can imagine that we will combine all the PDFs potentially required into one large PDF, but even if we do that there could be other documents that are necessary to retain. The uncertainty around the number and the size of these documents makes a document storage system a logical choice.
|
||||
|
||||
The fact that they are read only also helps inform the decision. Having them in a separate system that doesn't have a write capacity in the normal functioning of the system, except on filing, makes them safer from potential bugs that could cause a write to the wrong file. One could imagine for instance, a user is actively working, files, and then goes back to change something because they want to see if it will be reflected in their taxes. It is unlikely, but a bug could be introduced that updates the field data in the database. The extra level of protection is not necessary, but is an added benefit.
|
||||
|
||||
**Key Ideas**
|
||||
- The artifacts are documents and should be treated like documents.
|
||||
- We don't know how many potential artifacts we are dealing with.
|
||||
- These are read only, and having them separated helps to make that more clear and avoids potential bugs.
|
||||
|
||||
|
||||
#### Tax Software Schema Information
|
||||
|
||||
Either solution is acceptable in this context. We have a slight preference to storing it in the database so that some meta information can be stored with the schema. In the future we may want to change logging, marking the who, what, when, where, and why. It would be nice if we could also have the deltas of the change stored in a field nearby the schema.
|
||||
|
||||
If there is ever a need to query the schemas, maybe to show the change over time of a specific fact, it would be easier if they were in the relational database.
|
||||
|
||||
**Key Ideas**
|
||||
- Either system would work.
|
||||
- There are some advantages to storing the information in the relational database
|
||||
- Able to add meta info to relational database table easier
|
||||
- If we ever want to perform a query on this information the relational data will make that easier
|
||||
|
||||
|
||||
#### Permissions
|
||||
|
||||
The permissions are applied to both the filed tax information and the active tax information. Because the active tax information is stored in the relational database, it is logical to store the permissions there as well. It wouldn't make sense to store the permissions elsewhere. The system relied on the relational database for general operation. If the document storage system went offline, it would be possible to continue operation, either without filing or with a queued filing system in place. If the permissions were in another location we would lose this property.
|
||||
|
||||
**Key Ideas**
|
||||
- Permissions apply to the data in the relational system and the document store
|
||||
- The system should still be able to operate without the document store, which means that permissions should be in the relational database
|
||||
- Why would we store it away from the user's active data?
|
||||
|
||||
|
||||
|
||||
### Rejected ideas

### Assumptions

- User preferences are not of significant concern in terms of data storage.

#### Third party user IDs are

### Constraints

- We need a storage system that meets any SLO we may have.
- The system has to be widely understood and available. It shouldn't be something that contractors won't know.
- The system can scale without a bunch of management.

## Status

Pending

## Consequences

- Our UI will have to communicate to users that data entry does not update their personal information in a third party auth system.
---
parent: Decisions
nav_order: 100
title: "Yaml Translations"

# These are optional elements. Feel free to remove any of them.
status: "Proposed"
date: "20230906"
---

<!-- we need to disable MD025, because we use the different heading "ADR Template" in the homepage (see above) than it is foreseen in the template -->
<!-- markdownlint-disable-next-line MD025 -->
# Yaml Translations
## Context and problem statement
The translation files have three use cases:

* The frontend code dynamically populates screens with the content of the translations. The structure needs to work with react-i18next.
* The content will need to be written by a content team, and possibly taxperts, in the future.
* The content will also be provided as a set of meaningful strings for the translation teams to translate. They will return translations which will then need to be reintegrated into the codebase.

As this content will be consumed by non-engineers, the current strategy of allowing tags within strings is becoming unwieldy.

The translation team has expressed concern about the html-style strings, and we decided to find a less challenging and less error-prone format.

## Desirable solution properties

We'd like the format to:

1. Be easily consumed by our frontend and generate correct html
2. Be easily editable by the backend team currently writing content
3. Be easily editable by the content team when we transition to them editing
4. Split into strings that preserve enough context for translation teams to translate
5. Not contain too many tags within the strings (`<strong>` and `<i>` are ok)
## Considered options
1. Don't change anything, use json with all tags in strings
2. Use json but represent html as a structured json object
3. Use markdown and convert to json
4. Use yaml and represent html as a structured object
## Decision Outcome
Chosen option: Use yaml and represent html as a structured object

Reasoning: Yaml is designed to be easy for humans to read and understand, which will help our translation/content files be more editable and consumable for everyone. By introducing the structure, we can clean up the proliferation of tags in our strings and make the translation process go more smoothly.

Yaml is a superset of json and can represent all of our current json 1:1. It can also be converted losslessly to json for ingestion by react-i18next.
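As a rough illustration, a build step could convert the yaml source into the json resources that react-i18next loads. This is only a sketch: it assumes the `js-yaml` package and hypothetical file names (`en.yaml`, `en.json`), not our actual build tooling.

```
// Sketch of a possible yaml -> json conversion step for react-i18next resources.
// The file paths and the use of js-yaml are assumptions for illustration only.
import { readFileSync, writeFileSync } from 'node:fs';
import yaml from 'js-yaml';

const translations = yaml.load(readFileSync('locales/en.yaml', 'utf8'));
// react-i18next consumes plain JSON resources, so we serialize the parsed yaml.
writeFileSync('locales/en.json', JSON.stringify(translations, null, 2));
```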
Consequences:

* We need to convert the json to yaml, and break up the html strings into structured yaml, which will be used to generate the content.
* Editors of en.json will have to write yaml.
* Translators will get simplified strings with minimal html.
## Pros and Cons
### Don't change anything, use json with all tags in strings

```
"/info/you-and-your-family/spouse/intro": {
  "body": "<p>You'll need to know:</p><ol> <li><strong>If you're considered 'married'</strong><p>Many long-term partnerships resemble marriage, but for tax purposes, there are guidelines for what is and isn't considered married.</p></li> <li><strong>If you and your spouse want to file a return together</strong></li><p>People who are married can file a tax return together (jointly) or file separate returns.</li> </ol>"
},
```

#### Pros

* No work, it's what we have already.
* React-i18next supports json out of the box.

#### Cons

* This is getting harder to read and edit as we continue to insert more content into this file.
* Translators will struggle with the nested tags and will likely make errors, which will slow down translation updates.
* It's not easy to break these strings down into meaningful snippets to send to the translators that will convert back into json.
* The content team might also struggle with editing this format.
### Use json but represent html as a structured json object

```
"body": [
  "You'll need to know:",
  {
    "ol": [
      {
        "li": [
          "<strong>If you're considered 'married'</strong>",
          "Many long-term partnerships resemble marriage, but for tax purposes, there are guidelines for what is and isn't considered married."
        ]
      },
      {
        "li": [
          "<strong>If you and your spouse want to file a return together</strong>",
          "People who are married can file a tax return together (jointly) or file separate returns."
        ]
      }
    ]
  }
]
```

#### Pros

* React-i18next supports json out of the box (however we do have to dynamically generate the DOM elements).
* It can be flexible, and we can also limit what tags we accept as we dynamically generate the DOM elements.
* Can easily be broken up programmatically into meaningful snippets for translators.
* Limited tags in the strings themselves, so less error-prone.

#### Cons

* Content is hard to read and write because of heavy nesting.
* This is challenging for an engineer to edit; it will be very challenging for a non-dev to edit without introducing errors.
### Use markdown and convert to json

#### Pros

* Translators showed enthusiasm for this option.
* Can easily be broken up programmatically into meaningful snippets for translators.
* Limited tags in the strings themselves, so less error-prone.
* Content is easier to write for non-devs.
* Looks readable too.

#### Cons

* Not supported by react-i18next out of the box.
* We will need an extra step to get the markdown into a format for ingestion by react-i18next.
* Markdown is less strict than html, and it's possible the markdown->html conversion generates a larger variety of html than we intend to support.
* We need to convert the whole en.json into markdown.
* There is still an opportunity for errors when writing markdown.
### Use yaml and represent html as a structured object

```
body:
  - "You'll need to know:"
  - ol:
      - li:
          - "<strong>If you're considered 'married'</strong>"
          - "Many long-term partnerships resemble marriage, but for tax purposes, there are
            guidelines for what is and isn't considered married."
      - li:
          - "<strong>If you and your spouse want to file a return together</strong>"
          - "People who are married can file a tax return together (jointly) or file separate
            returns."
```
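To make the "dynamically generate the DOM elements" point concrete, here is a minimal, hypothetical sketch of how a structured body like the one above could be turned into React elements while limiting the accepted tags. The type and function names are illustrative and not part of the codebase.

```
// Illustrative sketch only: rendering a structured translation body into React
// elements with a tag whitelist. Names (TranslationNode, renderBody) are hypothetical.
import React, { ReactNode } from 'react';

type TranslationNode = string | { [tag: string]: TranslationNode[] };

const ALLOWED_TAGS = new Set(['p', 'ol', 'ul', 'li', 'strong', 'i']);

function renderNode(node: TranslationNode, key: number): ReactNode {
  if (typeof node === 'string') {
    // Inline tags like <strong> left inside strings would still need handling,
    // e.g. via react-i18next's Trans component.
    return node;
  }
  const [tag, children] = Object.entries(node)[0];
  if (!ALLOWED_TAGS.has(tag)) {
    return null; // drop anything outside the whitelist
  }
  return React.createElement(tag, { key }, children.map((child, i) => renderNode(child, i)));
}

export function renderBody(body: TranslationNode[]): ReactNode[] {
  return body.map((node, i) => renderNode(node, i));
}
```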
#### Pros

* It can be flexible, and we can also limit what tags we accept as we dynamically generate the DOM elements.
* Can easily be broken up programmatically into meaningful snippets for translators.
* Limited tags in the strings themselves, so less error-prone.
* Much more readable and editable in its raw format for both content teams and engineers.

#### Cons

* We need to convert the whole en.json into yaml.
* React-i18next doesn't support yaml out of the box (however yaml->json is a well-defined conversion).
* Opportunity for errors when writing the structure and getting the format and indentation correct.
# Data Import Redux

Date: 5 Nov 2024

## Context

To facilitate the correctness of the frontend data import system, we will use redux to coalesce data from APIs.

Data Import has many lifetimes that are independent of the app's traditional timeframes. Data fetches get kicked off, data returns later, and data gets validated as ready for import, all independently of user action.

Frontend screen logic is much simpler to write as a function of state rather than having to manage separate lifetimes in multiple places such as context, component state, etc.
## Decision

Data Import will use the Redux Library to make it easier for us to manage changing state related to incoming data from the backend.

We will write logic in redux to transform data as it comes in so that the frontend knows when to use it.
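For illustration only, a slice along these lines could coalesce sections of imported data as they arrive. This is a sketch assuming Redux Toolkit; the slice name, state shape, and action names are hypothetical, not the actual Direct File store.

```
// Hypothetical sketch of a data-import slice that folds independently arriving
// sections into one structure. Not the real Direct File state shape.
import { createSlice, PayloadAction } from '@reduxjs/toolkit';

type SectionStatus = 'pending' | 'ready' | 'timed-out';
type Section = 'aboutYou' | 'w2s';

interface DataImportState {
  sections: Record<Section, { status: SectionStatus; payload?: unknown }>;
}

const initialState: DataImportState = {
  sections: { aboutYou: { status: 'pending' }, w2s: { status: 'pending' } },
};

const dataImportSlice = createSlice({
  name: 'dataImport',
  initialState,
  reducers: {
    // Each section can arrive at its own time; reducers just merge it in.
    sectionReceived(state, action: PayloadAction<{ section: Section; payload: unknown }>) {
      state.sections[action.payload.section] = { status: 'ready', payload: action.payload.payload };
    },
    sectionTimedOut(state, action: PayloadAction<Section>) {
      state.sections[action.payload].status = 'timed-out';
    },
  },
});

export const { sectionReceived, sectionTimedOut } = dataImportSlice.actions;
export default dataImportSlice.reducer;
```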
# Alternatives Considered

## React Context and Fact Graph

- Lack of chained actions - because we expect data from different sections (about you, IP PIN, W2) to come in at different times, we need to be able to chain/retry fetches and coalesce them into one structure. Redux makes this much easier than the alternatives considered.
- Limits blast radius - with data coming in and out while people are on other screens, redux provides much better APIs to avoid rerenders on API calls that are not relevant to the current screen.
# Other Libraries

I looked briefly at Recoil, MobX, Zustand and Jotai, but they all seemed geared at simpler apps. Some of Data Import's initial features (e.g. knowing if more than a second has elapsed during a request) are much easier to implement in redux based on my prototyping. Secondly, Redux is so well used that nobody ever got fired for using redux :P
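As a sketch of that elapsed-time example (hypothetical endpoint and action names, assuming Redux Toolkit's `createAsyncThunk`):

```
// Illustrative only: flag a data-import request as "slow" if it takes over a second,
// so screens can react to it. The endpoint and action type are placeholders.
import { createAsyncThunk } from '@reduxjs/toolkit';

export const fetchImportedData = createAsyncThunk(
  'dataImport/fetch',
  async (_: void, { dispatch }) => {
    const slowTimer = setTimeout(() => dispatch({ type: 'dataImport/requestSlow' }), 1000);
    try {
      const response = await fetch('/api/data-import'); // hypothetical endpoint
      return await response.json();
    } finally {
      clearTimeout(slowTimer);
    }
  }
);
```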
# Future Uses

Redux has a few key advantages over things we have in our codebase right now:
- Automatically manages the render cycle more efficiently (important as our react tree grows ever larger)
- Proven at scale with complex application state
- Well known in the industry with a good tooling ecosystem

That being said, there are no current goals to rewrite anything in Redux. Rewriting core components of our application would take a lot more prototyping and exploration than is being done as part of this process.
# Data Import System Structure

Date: 7/3/2024

## Acronyms

- DF: Direct File
- DI: Data Import
- ETL: Extract, Transform, Load
- IR: Information Return
- TIN: Taxpayer Identification Number
## Context

To facilitate the correctness of the return and the speed of data entry, Direct File (DF) should present the user with information that the IRS has about them that would be relevant to their return. This could include information returns (1098s, 1099s, W2s, etc.), biographical information, and anything else that would help the user accurately file their return. To gather this information, the DF application set needs one or more applications making up a Data Import (DI) system that is responsible for gathering the information, either on demand or prefetched, and providing it to other systems upon request.

This ADR is an argument for a specific application layout for the DI system. It doesn't specify the interface between the DI system and the Backend App (or any other app), nor does it specify which IRS systems should be included in DI or an order of precedence for those systems. Any system names used are for example purposes and shouldn't be taken as a directive. The DI system accesses information considered controlled by the IRS, which is stored either in AWS or on-prem servers.

## Decision

The DI system should be a single app that checks IRS systems on request for the first year. In the second year, when we have a larger population of users, it might be useful to precache the imports for known users and store them for when the user returns. It is also doubtful that we could get an encrypted storage mechanism for such sensitive data approved in time for launch. This question of precaching should be revisited next year.
### Application Structure

#### Input
The input to the system should be a queue that accepts messages from the backend, provided on return creation. The message should contain the TIN of the user and any other information required by the IRS internal interfacing systems (modules of the DI system). The message should include some mechanism for correlating it back to a tax return, such as the ID of the return. It should also have the tax year as part of the message so that we only gather information relevant to this year; there will be a lot more than this year's data in some of these systems, and we may also need the tax year to support late returns in the future.

##### Simple Example Message (assuming use of message versions in DF)
```
{
  Tin: 123456789
  OtherInformation: kjsjdfjsdf // don't add that... this is a place holder because we don't know.
  TaxYear: 2024
  ReturnId: 2423bf16-abbf-4b4a-810a-d039cd27c3a0
}
```

#### Layout

We don't know the eventual number of systems we will have to integrate with, and the system should be set up with growth in mind. The base structure is a spoke-hub, with the hub being a spooler of information and the spokes being specific interfaces to internal IRS systems. When an input message comes in, the hub provides all of its spokes with a query information unit that every spoke will need to perform its lookups. This will likely start as only a TIN and tax year, but it may expand later.

##### Example QueryInformation
```
{
  Tin: 123456789
  TaxYear: 2024
}
```
Each spoke will be given an interface type and will be added to the available DI systems via configuration. On application start, we will read in the configuration and only instantiate the DI modules (the spokes) that are requested.

##### Example DataImportModuleInterface
```
public interface DataImportModule {
  UserData Query(QueryInformation information);
}
```
For the first year, all of the calls in the modules should be subject to a timeout of 5 (configurable) seconds. This is because we don't know how long some of these calls will take (both in cases where they don't have information and generally), and we don't want to stall the user waiting for data. We shouldn't cancel the call itself, but should rather send back what we have from the spooler, with the idea that we can send an update when we get the rest back. This implies that the timeout lives in the spooler, and the spooler keeps a record of previous calls for some length of time before disposing of them. For the first pass, most of the DI modules will perform REST calls to external systems. When they receive information back from their system, they should ETL and categorize that information into some basic concepts, rather than leaking where the data came from (n.b. it might one day matter where the data came from, but we won't worry about that this year). The example below is what the object may look like.
#### Example UserData
```
public class UserData {
  UUID TaxReturnId { get; set; }

  // this could be useful for negotiating conflicts if a precedence can be established.
  // DataImport for the Hub, otherwise the name of the system the data came from (totally optional)
  String SystemId { get; set; }

  // 1 for DI modules, aggregated for the Spooler (hub). If this equals
  // the number of configured systems we are done and should send.
  int SystemDataCount { get; set; }

  // biographical information includes name information, dependents, spouse data --
  // anything relating to a human
  Biographical BiographicalInfo { get; set; }

  // PreviousReturnData includes information about last year's return.
  // This information will be very valuable, but some things will not match
  // this year. It is helpful to have these separated out.
  PreviousReturnData PreviousReturn { get; set; }

  // InformationReturns contains collections of the various types of IRs
  // that we support. These are general categories like 1099 and W2, not
  // the types themselves. These supported types should be configurable.
  InformationReturns InformationReturns { get; set; }

  // All non-supported IRs should go in here.
  // We should assume that if anything makes it into here that
  // Direct File isn't the correct choice for the user and they should
  // be informed of that fact. This could be things like K1s or advanced
  // cases that we don't support.
  // These could just be the names of the forms themselves (i.e. we don't
  // have to map the form itself)
  OtherTypes OtherInformationReturns { get; set; }
}
```
The above example implies several things:
1) This makes a pretty reasonable interface to the Backend app. It contains all of the information required and is in a pretty usable form.
1) There are cases we don't support, and we have to reflect that. It will be nice for the user to be kicked out before they waste a bunch of time.
1) Each of these classes will need a method to merge information together, or be collections. It makes the most sense to do a best-guess merge for conflicting information. There are a few reasons for this, namely that we are offering information that needs to be validated by the user, and displaying multiple sets of information to a user will be very difficult/confusing.
1) Last year's return is categorically different from IR information pertaining to this year.
1) We will have to merge multiple UserData objects to get to a single UserData that will be passed to the backend.

With the DI modules generally covered (they have an interface, they make a REST call, they can be configured, they all send back a version of the same object), the focus now turns to the hub, which is just going to spool this data and reply back on a queue. When the spooler (the hub) receives a request, it will kick off all of the configured DI modules and start a timeout timer. It will store in memory the TaxReturnId keyed to the UserData object that will accumulate that user's data. As the calls come back (assuming they are done async here), they will carry the TaxReturnId property that allows them to be correlated with the correct UserData. A call will be made to mainUserData.merge(UserData data) that will merge the current and new UserData together (following general rules of encapsulation). If the timeout timer goes off, the spooler will enqueue a message on an SQS queue returning the current state of the data to the backend, but it will not remove the key-value pair from the in-memory collection unless all of the systems have returned. If all of the systems have returned (described in the above example), then it enqueues the message. The user data should either be put in a distributed cache and linked from the enqueued message, or included directly in the enqueued message, depending on how big these UserData objects become.

##### After timeout:
We may have valuable information come in after the configured timeout. We don't want to lose that data, and as such we shouldn't remove a user's data from memory until all systems have returned. We should enqueue a message each time we receive new UserData after the timeout. The UserData class should contain its own merge logic and should be made available to all projects (meaning it should be in our library). The backend can decide what to do with these updates.

#### Summary
DI Modules are interfaced pieces that talk to external systems. They store their information in UserData objects that can be merged together. DI Modules are configured to be running or not. The Spooler is the thing that receives messages and kicks off DI Modules. It also tracks all of its current TaxReturnIds and the data returned from the DI Modules. When the timeout timer goes off or all DI Modules return, the Spooler will enqueue a completed message. If more data comes in after the timeout, more messages will be enqueued.

n.b. This pattern could be used in either a precaching or on-demand system.
## Rationale

There are a few ways one could solve this problem. For example, each DI Module could be its own microservice. This would be more in line with a general microservice architecture. We would still require some spooling agent to take all of their information and be responsible for merging it. The main reason we don't follow this pattern is that we get very little benefit for the trouble. We would have more waiting, more messaging overhead, and more potential for lost data without any commensurate gain. If the number of these systems grows beyond a certain level, it may become valuable to reconsider this position (or if the complexity of the system grows beyond this basic concept). The simplest and fastest (at this scale) approach is to create one small application that handles all of these and runs its own child jobs in the form of the DI modules. We will know if something fails, when, and why, and we will be able to perform retries if necessary without having to do a bunch of messaging work to get it to work. This also simplifies deployment down to a single app, and it uses far fewer (AWS) resources this way.
## Rejected Concepts

#### Microservice
In general, applications should do one thing, and the boundary of an application should not be expanded beyond this purview. We have here a well-defined application boundary: a system that acts as an interface between the backend api (or any listener) and internal IRS systems for the purposes of gathering relevant user data. We could further split this into a set of applications that gather information and an application that performs the spooling. When each DI system finishes, it would raise an event, and the spooler app would gather the data from the distributed cache, merge it, and write it back. Philosophically this works, but it adds a lot of failure points for very little gain at our current scale and setup. In the future this may be a very useful thing to do. We could, for example, house these applications near their supporting app (like a TKTK DI service that sits near the TKTK service in the on-prem world and fetches from it), which would increase our speed and could get us faster information about those systems if we can integrate more deeply with them.

##### Failures in the Microservice system
The main failure point is the problem of messaging. All messaging across a network is subject to failures. The more messages we have to send, the more chances for these failures to occur. We should get some benefit in exchange for that failure risk, like higher scale or ease of development and onboarding, but with only 5 or so of these systems and with such a simple pattern we aren't going to gain anything. Our throughput will not likely be limited by our own applications but rather by the applications we rely on.

We are also relying on our distributed cache to stay up throughout the operation. We can handle the timeout on the requesting (backend api) app, and we can continue to count how many returns we have. What would happen if our distributed cache failed, though? How would we know what we lost, if we lost anything? No individual pod would (or could, really, without introducing a P2P communication mesh, rumor/whispers protocol, or something really out of pocket with the queues) be managing the job in the microservice system, and so if there were failures they would be harder to track and propagate. We don't yet know all of the failure cases, so it will be good to have these operations tightly managed by the spooler in the first year.

#### Backend API does it
The backend API is already responsible for a lot. This is also an asynchronous process that may take a bit of time. The complexity of the backend API is already hard enough to track; we should avoid adding discrete systems to it, if for no other reason than so we can continue to reason about it. Keeping this work out of the backend API does add a couple of potential failure points compared with doing it in place, but it comes with a substantial reduction in backend api complexity and memory use, and it preserves the ability to eventually turn this into a precaching system if we desire (which would be awkward if we were standing up backend api apps for that).

#### Linear Flow/Non-Spoke-Hub System
This system has about 5 very similar operations that easily slot into a spoke-hub system. Why fight against the obvious pattern? If this was done linearly it might be supportable for the first year, but it would be messy. It would also be much slower, as each operation would have to be done in sequence. That would give you an automatic precedence order, which is a nice feature. This system design would not allow for growth in the future, good code reuse, or general supportability.
## Assumptions

- Messaging systems occasionally lose messages: this is true, and it is underpinning at least some of the claims above.
- Having the per-tax-return system checks centrally managed is desired: my belief here is that we don't know all of the error cases, and it will be easier for us to see them in a single application and create tests that exercise these failure conditions. There is a world in which that isn't necessary, but I think it has some solid side effects.
- Standing up more pods is annoying and not worth it: we have already updated our CDRP and docs with a single service doing this. We probably could update them, but I don't see the benefit yet. I would like to hear a solid reason why breaking this into multiple applications is worthwhile.
- Merging is a wise idea: there is a world in which we just want to dump everything we know rather than doing a best-guess merge. My view is that inundating the backend (and maybe the client/user) with this information isn't a useful thing to do. We want the user to check this information anyway. We aren't making any promises about the correctness!
- The bottleneck we will face is the other systems and not our system: if the other system is the bottleneck, then it doesn't make sense to stand up a bunch of endpoint-hitting services that just return data from that service. If that isn't true, if we are the problem, then it makes sense to have as many services hitting that endpoint as we can make use of. It could be as many as (made-up number ahead) 50:1 DI services to spoolers in that case. It is reasonable to assume that the other systems will be the problem in this flow, because where would our bottleneck come from? We will stand up as many of our services as needed, but the ratio of endpoint hitters to spoolers will always remain constant (1:1 per type of DI service and spooler).
## Constraints
- This is for our current scale and complexity. These trade-offs are different at higher scales.
- This is for an assumed <20 system integrations
## Status
Pending

## Consequences
- We may hit a point, say around 25 services, where this pattern becomes cumbersome. It doesn't make sense to have so many calls on one system, and the number of integrated systems decreases the total available load per system by some percentage. At some level we will want to split out the DI modules into microservices, but we shouldn't do that until it makes sense.
# Top line DevX goals
*Note: these are topline goals -- they may be contradictory, and specific situations may involve trading off against these separate goals*

1. Creating a working, running setup of all services should be one command, `docker compose build`.
1. Running each service locally should be one command, `./mvnw spring-boot:run -Dspring-boot.run.profiles=development`
1. Running tests on each service should be one command, `./mvnw test`
1. Basic IDE settings for intellij and VS Code are checked into the repository, and opening a service or the full `direct-file` repository in these IDEs should "just work". Tests should run in the IDE, local debuggers should work, and there should be no red squigglies or unresolved problems in the IDE.
1. Formatters and linters (e.g. spotless and spotbugs) should standardize formatting, prevent unused variables, enforce styling, and otherwise keep the codebase clean and consistent. CI should check this formatting and linting.
1. Code standards are kept via CI tests and robots, not human code reviews.
1. We should have a build time of 5 minutes or less for our CI checks.
1. Environment variables and scripts that require setup outside of our build scripts should be minimal. When environment variables or scripts are required, and are not set, a service should fail immediately and loudly with a helpful error message.
1. Making a simple change should be simple and take a short amount of time (e.g. updating a fact graph -> MeF mapping should be 1 line of change)
1. Similarly, writing a simple test should be simple, and modifying our tests (unit or snapshot) should be clear and easy.
1. Error messages that indicate a problem in environment setup should be clear and simple.
1. Error and warn logs should always point to a real and solvable problem.
1. We should not have manual code changes that are derivative of other code (e.g. updating `aliases.ts` for aliased facts).
1. Any script that is a part of a common dev workflow should be checked + run in CI.
## First 3 goals
We're doing great on the first 3 DevX goals!

## Basic IDE settings
1. We should check in recommended VS Code settings + plugins for Java development.
1. We should (?) have recommended settings for Intellij checked in.
1. We should fix the bad imports on jaxb sources for the IRS MeF SDK.
1. We should finish setting up spotbugs across projects, and resolve our existing spotbugs errors.
1. We should standardize our linting + formatting tools, and make them run in CI to prevent unresolved problems.

## Environment variables
1. We should have a louder error + immediate failure for not having `LOCAL_WRAPPING_KEY` set when someone runs a
2. Submit and status apps should fail to start without the appropriate environment variables set, along with a message about what variables are required.

## Simple changes should be simple
(Alex lacks the most content here, and would love people to add more)
1. We should stop checking in generated code for the frontend fact dictionary, and move that to a build step when someone starts, builds, or tests the frontend, since backend devs should not need to run `npm run build` when they modify facts.
1. We should stop checking in generated MeF code for the backend MeF packing transform, and move that to a build step that runs prior to the backend's `./mvnw compile` command. That command should run consistently.
1. We should remove all checked-in intermediate formats and only have checked-in writable fact graphs and checked-in XML snapshots.

## Simple tests should be simple
We're doing a lot better here than we used to! Snapshot tests now run in CI and regenerate automagically, instead of being an ETE test.

## Error messages + logging
1. We should look through our existing logs and check for unnecessary warn/error messages that are noisy, and remove those noisy logs (e.g. the address facts that currently spam every packing).

## Removing duplicative manual code changes
1. We should fix `aliases.ts` to not require changes any time someone creates a `Filter` or `IndexOf` fact.
# Bandwidth Constraints

Our users are going to come from all over the country, with varying access to broadband internet. We want to make sure that we're supporting the vast majority of users. To specify that further, we're looking at device and bandwidth data and deciding on the following constraints:

## Bandwidth Constraints

The source for bandwidth data is the [FCC 2021 Bandwidth Progress Report](https://docs.fcc.gov/public/attachments/FCC-21-18A1.pdf), which reports bandwidth levels through 2019.

As of 2019, 99.9% of people in the US have access to either:
1. Mobile 4G LTE with a Minimum Advertised Speed of 5/1 Mbps OR
2. Fixed Terrestrial 25/3 Mbps

including 99.7% of people in rural areas and 97.9% of people on Tribal lands (fig 3c).

Given this, we can support 99.9% of our population by supporting users with **5Mbps down / 1Mbps up**.
## Latency Constraints

The FCC does not include latency in its Bandwidth Progress Report (III.A.17), so we're left to other, less official sources. However, we can set our bar at "what is the latency of 4G LTE?", given that it's going to be the worst-case latency for the 99.9% of people who have 4G LTE or Terrestrial 25/3 Mbps.

In this case, latency refers to the time it takes for your phone to get a response from the cell tower at the other side of its radio link. It doesn't include time from that radio link to our cloud data center, or any processing we'd do there, or the response time. But this is the measure that would put 4G on the same page as a terrestrial link.

Average 4G latency is around or below 50ms<sup>[1](https://www.statista.com/statistics/818205/4g-and-3g-network-latency-in-the-united-states-2017-by-provider/),[2](https://www.lightreading.com/mobile/4g-lte/lte-a-latent-problem/d/d-id/690536)</sup>, while the maximum latencies I can find on the internet seem to be around 100ms (though this seems hard to find information on!)<sup>[3](https://www.researchgate.net/figure/Maximum-and-average-latency-in-4G-and-3G-networks-6_fig3_338598740)</sup>

Given this, we should set our **latency target at 100ms** to support the worst case of the 99.9% of people in the US whose worst connection choice is 4G LTE.

### Dev suggestion
To simulate our worst case users, you can create a [network throttling profile](https://developer.chrome.com/docs/devtools/settings/throttling/) in Chrome devtools. You can use the app with that profile enabled to see the loading screens and timing experience that we're expecting of our worst case users.

#### Appendix

In 2023, apparently you can take zoom calls from the side of a mountain. And in 2024, maybe TKTK will file his taxes from there, too.
<img width="695" alt="On a mountain" src="https://github-production-user-asset-6210df.s3.amazonaws.com/135663320/251536414-ce525b7c-e795-49cd-ab33-6950a54d5ec4.png?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAVCODYLSA53PQK4ZA%2F20250416%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20250416T225041Z&X-Amz-Expires=300&X-Amz-Signature=be26cddd37bb08083453688fa707babf3f8ea027eb3d66e0a672b4f4724e4e62&X-Amz-SignedHeaders=host">
Created At: 2 Aug 2024

RFC: Introducing Javascript/Typescript to DF Backends

# Problem Statement

Currently, all of our backend code is written in Java [^1]. We have a couple of use cases where this might not be ideal:
1) One-off production remediation code during filing season
2) Logic that is shared between the frontend and the backend

[^1]: Two exceptions worth noting - (1) we have simulators for development which are in python and (2) our fact graph is written in scala and runs on the JVM.

# Background

## Languages and Toolchains

We want to minimize the number of languages and toolchains in direct file. This has the following benefits:
- Less engineering time spent on configuring linting/CI/testing infrastructure
- Easier for an individual engineer to ramp up on the end to end of a product scenario
- Eliminates the need for handoffs between language boundaries
## Shared Frontend <> Backend Logic

Today we have our factgraph logic shared between the frontend and backend. We achieve this by transpiling scala into javascript. Many projects eventually want to share logic between frontends and backends, and javascript is the lowest-overhead approach to do this. Although we're not sure how the factgraph will evolve over time, one possibility is running it server-side via a javascript lambda.

## One-off Remediations
In production during filing season, we often want to run code in a production environment to fix issues. We do this using lambdas triggered by s3 file notifications in production. These can often be written faster in a dynamic language like javascript/python/ruby. Going into filing season 2025 we would like to have the capability to run these types of remediation lambdas on a dynamic language.

# Decisions

Enable lambdas running the most recently available Node LTS in direct-file, and start experimenting with appropriate usage.
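As a rough sketch of the shape this could take (hypothetical bucket handling and remediation logic, not an actual Direct File remediation), an S3-notification-triggered lambda handler in TypeScript might look like:

```
// Illustrative only: a TypeScript lambda handler triggered by an S3 file notification.
// The remediation step itself is a placeholder.
import type { S3Event } from 'aws-lambda';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({});

export const handler = async (event: S3Event): Promise<void> => {
  for (const record of event.Records) {
    const bucket = record.s3.bucket.name;
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
    const object = await s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
    const body = await object.Body?.transformToString();
    // Placeholder: inspect the dropped file and apply the one-off fix here.
    console.log(`Remediation triggered for ${bucket}/${key} (${body?.length ?? 0} bytes)`);
  }
};
```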
*(Supporting files `docs/adr/supporting-files/adr-namespacing-11-26-23-dictionary.svg` and `docs/adr/supporting-files/adr-namespacing-proposal.svg` are Graphviz-generated diagrams of the fact dictionary; the raw SVG markup is omitted here. The namespacing proposal diagram shows modules (filers, dependents, filingStatus, deductions, income, credits, estimatedPayments, and finalAmount) connected by cross-module facts such as `/filersCouldntBeDependents`, `/isMarried`, `/isFilingStatusMFJ`, `/taxableIncome`, `/agi`, `/totalTax`, and `/totalPayments`.)*
# Tax Logic Testing Strategy
12/15/23

## Problem Statement
The goal of Tax Logic Testing is to test:
1. Do we show users the correct screens and collect the correct information from each user based on their previous answers? This document refers to this problem as _completeness_: whether the user has completed the correct screens and input a complete set of data. (Flow + fact dictionary interactions)
1. Do we correctly compute filing status, dependency, deduction, and credit eligibility, as well as the final tax amount calculations? This document refers to this problem as _correctness_: whether our derived facts have the appropriate calculations. (Fact dictionary interactions)
1. Are we resilient to review + edit conditions (telling a user to resume where they need to resume and using only valid data)? (Flow + fact dictionary interactions)

We want to make sure that we are testing each of these criteria, automatically, on every pull request. We have already built most of the test infrastructure that we need for this. For our next steps, as we move closer to the pilot, we need to define and increase our test coverage for each of these problems.

## Where this proposal falls into the overall test framework

These tests, and this document, cover a lot of tax logic, but these three areas are outside of our scope:

### Scenario tests
The testing team will perform "scenario testing," which is an integration test that applies scenarios and checks that the expected tax burden is correct. This test type is a black box test, where it only registers a `PASS` or `FAIL` (with expected and observed values). This test's goal is to run through as many scenarios as we can very quickly to identify regressions.

### Manual testing with subject matter experts
We're relying on the testing team to go through a greater level of scenario testing + manual testing and file bugs on us when they find issues. We should hopefully be able to identify where we were missing a test, write a failing test case, and then fix the issue. SMEs and tax policy experts doing manual testing is how we plan to find issues in _our understanding_ of the tax code.

### Testing after a user submits their data
This testing goes from the user through to the fact graph: we are testing that we generate the correct derived facts. This testing does not include MeF integration -- the backend team is responsible for translating our fact graph representation into MeF XML (though we will gladly help with facts!)
## Methods
|
||||
Based on the recent modularization of the fact dictionary, we've defined that ~140 of the ~800 facts are used in multiple sections of the app -- these facts, which we can call **culminating facts** are the ones that determine a tax test, eligibility, or computation, and they should be selected for additional test coverage. We base our test strategies around proving the completeness and the correctness of each culminating fact.
|
||||
|
||||
The vast majority of our bugs during development have happened from these culminating facts either having been incomplete (e.g. you went through the dependents section, but we still didn't know if that dependent qualified for Child Tax Credit), or incorrect (e.g. The taxpayer was born January 1, and we calculated their Standard Deduction wrong).
|
||||
|
||||
This graph shows the current state of 143 culminating facts connecting our 21 independent modules:
|
||||
|
||||

|
||||
|
||||
This is the graph that we must test. We have four generalized types of tests, each of which have their own purpose in our testing strategy. All tests run on pre-merge, prior to every commit and deploy of our codebase.

### 1. Testing for Completeness

Historically, this has been our largest source of bugs, but we have recently made strides by being able to test that a section will always complete certain facts. For this, we use our `flowSetsFacts` tests, which exercise a combination of our flow and fact dictionary. Existing tests live in ../direct-file/df-client/df-client-app/src/test/completenessTests.

These tests run via a simulator that follows all potential paths in the flow for each potentially consequential answer to a question. Because of this, the tests are computationally expensive to run, but they have a low margin for manual error compared to our other forms of testing. We still have more work to do before we can run them on the largest sections of the app (spouse and dependents).
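
To make the mechanics concrete, here is a minimal sketch of the shape such a test can take. It is not the real `flowSetsFacts` harness: the `Screen` type, the toy spouse section, and the fact paths below are invented stand-ins that only illustrate the idea of enumerating every consequential answer and asserting that the section's facts end up complete.

```ts
// Illustrative stand-in only -- the real helpers live in
// df-client-app/src/test/completenessTests and drive the actual flow config.
import { describe, expect, it } from 'vitest';

type Screen = {
  setsFacts: string[];
  // Chooses the next screen id based on the answers given so far; null ends the section.
  next: (answers: { married: boolean }) => string | null;
};

// A toy two-screen "spouse" section with one consequential answer.
const spouseSection: Record<string, Screen> = {
  maritalStatus: {
    setsFacts: ['/maritalStatus'],
    next: (answers) => (answers.married ? 'spouseDependentStatus' : null),
  },
  spouseDependentStatus: {
    setsFacts: ['/spouseCanBeClaimed'],
    next: () => null,
  },
};

// Walk the section for one set of answers and collect every fact it sets.
function factsSetOnPath(answers: { married: boolean }): Set<string> {
  const facts = new Set<string>();
  let screenId: string | null = 'maritalStatus';
  while (screenId !== null) {
    const screen = spouseSection[screenId];
    screen.setsFacts.forEach((fact) => facts.add(fact));
    screenId = screen.next(answers);
  }
  return facts;
}

describe('spouse section completeness (illustrative)', () => {
  it('completes the dependent-taxpayer facts on every path where they apply', () => {
    // Enumerate each consequential answer, as the simulator does for the real flow.
    expect(factsSetOnPath({ married: true }).has('/spouseCanBeClaimed')).toBe(true);
    expect(factsSetOnPath({ married: false }).has('/maritalStatus')).toBe(true);
  });
});
```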

#### Test Examples

* At the end of the spouse section, do we know whether the taxpayer (and a potential spouse) need to be treated as dependent taxpayers?
* At the end of a dependents collection item, do we know whether a person qualifies as a dependent, or as a qualifying person for each filing status + credit?
* At the end of the W2 section, do we know a person's employer wage income, as well as whether they have combat pay?

#### Coverage

We need to begin measuring our coverage of each of the culminating facts. Every culminating fact should be known to be complete by the end of a section in the app.

### 2. Testing for Correctness

Each of our culminating facts should represent a concept in the tax code (e.g. a rule, an eligibility determination, or a computation), or an important fact on the way there. We write vitest tests that generate a fact graph and check that our fact dictionary rules match our understanding. Existing tests live in ../df-client/df-client-app/src/test/factDictionaryTests. These tests exercise just the fact dictionary.

This requires a potentially high volume of tests to be written for each culminating fact, and may also require taking other culminating facts as input. When a tax bug is reported, we should generally be able to write a fact dictionary test that would prevent the bug from occurring again.
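
As a sketch of the shape these tests take, here is a hedged example keyed to the standard-deduction case listed below. The real tests build a fact graph from the fact dictionary and read the derived fact; `standardDeduction` here is an invented stand-in function, and the dollar amounts are illustrative rather than authoritative.

```ts
// Illustrative stand-in only -- the real tests in src/test/factDictionaryTests
// construct a fact graph and assert on the derived fact, not on a helper function.
import { describe, expect, it } from 'vitest';

// Illustrative amounts, not authoritative tax-year values.
const BASE_SINGLE_DEDUCTION = 13_850;
const ADDITIONAL_AGE_65_AMOUNT = 1_850;

// Stand-in for the derived fact: a single filer's standard deduction.
function standardDeduction(filerAge: number): number {
  return BASE_SINGLE_DEDUCTION + (filerAge >= 65 ? ADDITIONAL_AGE_65_AMOUNT : 0);
}

describe('standard deduction for older filers (illustrative)', () => {
  it('adds the additional amount once the filer is 65', () => {
    expect(standardDeduction(64)).toBe(BASE_SINGLE_DEDUCTION);
    expect(standardDeduction(65)).toBe(BASE_SINGLE_DEDUCTION + ADDITIONAL_AGE_65_AMOUNT);
  });
});
```

When a bug like the January 1 birthday case noted above is reported, the corresponding regression test would assert on the same derived fact with that specific date of birth.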

#### Examples

* After a certain income amount, the student loan adjustment will begin to phase out.
* A taxpayer will receive an additional amount of standard deduction if they are over 65.
* A person is ineligible for Qualifying Surviving Spouse status if they were ineligible to file jointly in the year of their spouse's death.
* A child without an SSN can qualify a taxpayer for EITC, but is not used in the computation.

#### Coverage

We need to measure how many of our culminating facts have unit test coverage.
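
One possible way to measure this is sketched below, under loose assumptions: the list of culminating facts and the test directory path are placeholders, and a real measurement would more likely hook into the fact dictionary's module exports than into a regex scan of test files.

```ts
// Hedged sketch of a coverage check, not existing tooling. It scans test files
// for fact paths and reports which culminating facts never appear in any test.
import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Placeholder subset; the real list would come from the fact dictionary modules.
const culminatingFacts = ['/standardDeduction', '/isEligibleForEitc', '/isEitcQualifyingChild'];

function factPathsReferencedIn(testDir: string): Set<string> {
  const referenced = new Set<string>();
  for (const file of readdirSync(testDir)) {
    if (!file.endsWith('.test.ts')) continue;
    const source = readFileSync(join(testDir, file), 'utf8');
    // Collect anything that looks like a quoted fact path, e.g. '/standardDeduction'.
    for (const match of source.matchAll(/['"`](\/[A-Za-z0-9/]+)['"`]/g)) {
      referenced.add(match[1]);
    }
  }
  return referenced;
}

const covered = factPathsReferencedIn('src/test/factDictionaryTests');
const uncovered = culminatingFacts.filter((fact) => !covered.has(fact));
console.log(`${culminatingFacts.length - uncovered.length}/${culminatingFacts.length} culminating facts referenced in tests`);
uncovered.forEach((fact) => console.log(`no test references ${fact}`));
```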

### 3. Functional Flow Navigation

After each screen, we must use the current fact graph state to choose the screen the user will see next. The screens the user sees affect the data that is collected. To test that the user sees the right screen based on their answers, we have written functional flow tests: for each screen, we set up a starting state, place the user on the screen, set a fact, and assert which screen the app moves to next. Existing tests live in ../direct-file/df-client/df-client-app/src/test/functionalFlowTests.

Creating these tests can be prone to manual error, since it requires a developer to reason about the potential next screens after any given screen. Combined with our completeness tests above, this checks that the user has input the correct data. These tests exercise a combination of the fact dictionary and the flow.
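
A minimal sketch of the pattern, keyed to the citizenship example listed below. The real tests drive the actual flow configuration from a fact graph state; `nextScreenAfterCitizenship` and the screen names here are invented stand-ins.

```ts
// Illustrative stand-in only -- the real tests in src/test/functionalFlowTests
// start from a fact graph state, set a fact, and assert the screen routed to next.
import { describe, expect, it } from 'vitest';

// Stand-in routing rule for the screen that asks about full-year citizenship.
function nextScreenAfterCitizenship(facts: { citizenAllYear: boolean }): string {
  return facts.citizenAllYear ? 'state-residency' : 'citizen-at-end-of-year';
}

describe('citizenship screen navigation (illustrative)', () => {
  it('routes a full-year citizen to the state residency question', () => {
    expect(nextScreenAfterCitizenship({ citizenAllYear: true })).toBe('state-residency');
  });

  it('asks everyone else about citizenship at the end of the year', () => {
    expect(nextScreenAfterCitizenship({ citizenAllYear: false })).toBe('citizen-at-end-of-year');
  });
});
```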

#### Examples

* A taxpayer who says they were a US citizen all year will move on to being asked about their state residency. If they were not a citizen all year, they will be asked whether they were a citizen at the end of the year.
* A taxpayer who notes that they lived in a single, in-scope state will go on to enter their TIN next. A taxpayer who says they lived in multiple states will move to a knockout screen.

#### Coverage

We need to track what percentage of screens we use as a starting point in a functional flow test, and what percentage of screens we navigate to in a functional flow test.

### 4. Static Analysis

We use static analysis (mostly TypeScript type safety) to ensure that we read facts that exist and write to facts that are well defined. Similarly, we use static analysis to ensure that derived facts only use culminating facts from other sections, and never internal facts that aren't meant for outside consumption.
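
A small sketch of the kind of type safety meant here. In df-client the fact path and value types are derived from the fact dictionary; the union, the value map, and the `readFact`/`writeFact` helpers below are invented stand-ins.

```ts
// Illustrative stand-in only -- in the app these types are generated from the
// fact dictionary rather than written by hand.
type FactPath = '/isEligibleForEitc' | '/livedWithTaxPayerAllYear' | '/standardDeduction';

type FactValueByPath = {
  '/isEligibleForEitc': boolean;
  '/livedWithTaxPayerAllYear': boolean;
  '/standardDeduction': number;
};

const factStore = new Map<FactPath, FactValueByPath[FactPath]>();

function writeFact<P extends FactPath>(path: P, value: FactValueByPath[P]): void {
  factStore.set(path, value);
}

function readFact<P extends FactPath>(path: P): FactValueByPath[P] | undefined {
  return factStore.get(path) as FactValueByPath[P] | undefined;
}

// Compiles: the path exists and the value has the right type.
writeFact('/livedWithTaxPayerAllYear', true);

// @ts-expect-error -- rejected at compile time: this fact is not in the dictionary.
readFact('/notARealFact');

// @ts-expect-error -- rejected at compile time: this fact holds a boolean, not a number.
writeFact('/isEligibleForEitc', 42);

export { readFact, writeFact };
```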

We can build additional static analysis that may help with our robustness to edits.

#### Test Examples (existing)

* The client reads `/isEligibleForEitc` -- that is a fact defined in the fact dictionary.
* We write a boolean value to `/livedWithTaxPayerAllYear` -- that is a boolean fact in the fact dictionary.
* EITC relies on a dependent having qualified as an EITC qualifying child in the dependents section. We check that `/isEitcQualifyingChild` is marked as exported from the dependents module.
* A fact is used by MeF. We check that that fact exists in the fact dictionary.

#### Coverage

Static analysis operates on the full codebase and does not have a coverage metric.