SF-SAC intake validations
Intake validations are run as an audit is submitted piece-by-piece. In order to validate SF-SAC form sections, we do three things:
- Convert the spreadsheet into JSON
- Validate the data in this intermediate JSON form
- Convert the data to a meaningful semantic representation
Each of these steps helps ensure a better submission to the FAC.
Reporting errors
Intake validations are critical for several reasons.
- They are user-centered.
When a validation finds a mistake in the data, we try and produce error messages that are human-friendly. They are what help auditors and grantees fix errors they may have made. Our messages specify errors down to the specific cell, and are refined for usability based on user feedback. - They are self-contained.
Each validation does only one thing. This helps the GSA team with maintaining the validations. - They are comprehensive.
The intake validations are intended to cover all possible error conditions we might encounter. In the event that an auditor or grantee finds a new, surprising way to enter data into our forms... we add a new intake validation.
Ultimately, our goal is for the intake validations to provide a good user experience when part of the SF-SAC is completed incorrectly while simultaneously ensuring that we colelct quality data via the form.
Step 1: Convert the spreadsheet into JSON
Spreadsheets are first converted into JSON, or Javascript Object Notation. Internally, this is referred to as the intermediate representation. This intermediate representation is a small language that uses named ranges to define regions of data. We describe this representation more fully as part of our intakelib documentation.
This conversion is a validation of its own kind; if we cannot convert the input to our intermediate representation, it is obviously not an SF-SAC upload! The SF-SAC forms must be saved in an Excel 2007 XML-based format (or xlsx
). We chose this because many current spreadsheet manipulation tools (e.g. LibreOffice) are capable of saving data in this format. This means that it is possible to prepare an SF-SAC submission using free and open source software.
Step 2: Validate the data
The intermediate form makes the data validation easier. Every section of the SF-SAC has a common set of validations applied. For example, we make sure every form section has a UEI defined. Then, we apply section-specific validations.
Form section | Validation | Description |
---|---|---|
ALL | ||
uei_exists | Checks that the UEI is present | |
look_for_empty_rows | Alerts to empty rows in the middle of a form | |
start_and_end_rows_of_all_columns_are_same | Makes sure all columns are the same length | |
is_right_workbook | Checks that the workbook matches the section being submitted to | |
has_all_the_named_ranges | Asserts all input ranges for a section are present | |
has_all_required_fields | Checks that all required fields are present in a given workbook | |
check_for_gsa_migration_keyword | Makes sure GSA_MIGRATION only appears in the context of the data migration | |
Federal Awards | ||
y_or_n__fields | Checks all Y/N fields are correct | |
federal_award_amount_passed_through_optional | Checks there is a subrecipient amount present if "Y" selected for "Passed Through" | |
check_cluster_names | Checks the correctness of cluster names | |
state_cluster_names | If STATE_CLUSTER is indicated, asserts they provided a state cluster name | |
other_cluster_names | If OTHER_CLUSTER is indicated, asserts they provided some other cluster name | |
passthrough_name_when_no_direct | Asserts correct behavior of passthrough name/ID when funding is direct/nidirect | |
loan_balance_present | If a loan is guaranteed, checks that a balance is provided | |
loan_balance_entry_is_valid | Checks that a loan balance is a number, empty string, or N/A | |
no_major_program_no_type | Checks presence of report type depending on major/minor declaration | |
all_unique_award_numbers | Checks that no award numbers are duplicated in a submission | |
sequential_award_numbers | Asserts AWARD-#### numbers are present and sequential | |
aln_agency_prefix | Checks for valid ALN prefixes | |
aln_three_digit_extension | Checks for valid ALN extensions | |
additional_award_identification | If RD or U is indicated in the ALN, enforces additional award information | |
federal_program_total_is_correct | Checks sums of federal programs against values submitted | |
cluster_total_is_correct | Checks sums of cluster totals | |
total_amount_expended_is_correct | Asserts that the total amount expended on the coversheet matches the sum of the values reported | |
cardinality_of_passthrough_names_and_ids | Checks that passthrough names and IDs are of the same cardinality | |
Notes to SEFA | ||
y_or_n__fields | Checks all Y/N fields are correct | |
award_references_pattern | Asserts award reference numbers are correctly formatted | |
finding_prior_references_pattern | Asserts prior award numbers are correctly formatted | |
finding_reference_pattern | Asserts the finding reference numbers are correctly formatted | |
no_repeat_findings | Checks for prior references if they are indicated | |
findings_grid_validation | Enforces correct possible values for findings Y/N grid | |
Findings Text | ||
y_or_n__fields | Checks all Y/N fields are correct | |
Corrective Action Plan Text | ||
y_or_n__fields | Checks all Y/N fields are correct |
Step 3: Convert the data to a meaningful form
After intake validations are applied, we then transform the data into its storage form. This is another JSON representation, but it is more meaningful. Instead of representing the spreadsheet, it represents the meaningful connections between portions of the data. This form is then further validated by our schema validations.