| Title: | Parse Farm Credit Administration Call Report Data into Tidy Data Frames |
|---|---|
| Description: | Parses financial condition and performance data (Call Reports) for institutions in the United States Farm Credit System. Contains functions for downloading files from the Farm Credit Administration (FCA) Call Report archive website and reading the files into tidy data frame format. The archive website can be found at <https://www.fca.gov/bank-oversight/call-report-data-for-download>. |
| Authors: | Michael Thomas [aut, cre], Ivan Millanes [aut], Ketchbrook Analytics [cph, fnd] |
| Maintainer: | Michael Thomas <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.6 |
| Built: | 2026-06-01 21:26:41 UTC |
| Source: | https://github.com/ketchbrookanalytics/fcall |
compare_files_content() reads the content of a specified file from two
folders and compares the content using the waldo::compare function. It
identifies any differences in the content and returns the comparison results.
compare_files_content(filename, dir1, dir2)
| Argument | Description |
|---|---|
| filename | A character string specifying the name of the file to compare. |
| dir1 | A character string specifying the path to the first folder. |
| dir2 | A character string specifying the path to the second folder. |
compare_files_content() reads the content of the specified file
from both folders using readLines(), and compares the content using the
waldo::compare() function.
A list containing information about differences in the content of the specified file.
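The read-then-compare behaviour described above can be approximated in a few lines. This is a minimal sketch rather than the package's implementation: the helper name `compare_lines_sketch()` is hypothetical, and it assumes the waldo package is installed.

```r
# Hypothetical helper (not part of the package): compare one file across
# two folders, line by line.
compare_lines_sketch <- function(filename, dir1, dir2) {
  # Read the file from each folder as a character vector of lines
  x <- readLines(file.path(dir1, filename), warn = FALSE)
  y <- readLines(file.path(dir2, filename), warn = FALSE)

  # waldo::compare() returns an empty result when x and y are identical
  waldo::compare(x, y, x_arg = "dir1", y_arg = "dir2")
}
```

A result of length zero means the two copies of the file match; anything else describes where they diverge.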
compare_metadata() compares the content of the metadata files (files that
start with "D_") between two specified folders containing FCA Call Report
data (from two different quarters).
compare_metadata(dir1, dir2)
| Argument | Description |
|---|---|
| dir1 | (String) The path to a folder containing FCA Call Report .TXT files for a single quarter |
| dir2 | (String) The path to a folder containing FCA Call Report .TXT files for a (different) single quarter |
compare_metadata() lists metadata files in each folder, identifies shared
metadata files, and then compares (a) the number of files, (b) file names,
(c) file order, and (d) file content (using the waldo::compare() function).
A list containing information about differences in file names, file order,
and file content between the metadata files in dir1 and dir2.
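The file-level part of this comparison can be sketched in base R. The helper name `compare_metadata_names()` and the names of the returned list elements are hypothetical. For content, the sketch swaps in `identical()` on the lines of each shared file in place of `waldo::compare()`, and it does not check file order, because `list.files()` already returns names sorted.

```r
# Hypothetical helper (not part of the package): summarise how the "D_"
# metadata files differ between two folders.
compare_metadata_names <- function(dir1, dir2) {
  files1 <- list.files(dir1, pattern = "^D_")
  files2 <- list.files(dir2, pattern = "^D_")
  shared <- intersect(files1, files2)

  # TRUE for each shared file whose lines are the same in both folders
  same_content <- vapply(
    shared,
    function(f) identical(
      readLines(file.path(dir1, f), warn = FALSE),
      readLines(file.path(dir2, f), warn = FALSE)
    ),
    logical(1)
  )

  list(
    n_files = c(dir1 = length(files1), dir2 = length(files2)),
    only_in_dir1 = setdiff(files1, files2),
    only_in_dir2 = setdiff(files2, files1),
    shared = shared,
    content_differs = shared[!same_content]
  )
}
```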

    # Download data from September 2025
    path_1 <- tempfile("fcadata1")
    dir.create(path_1)
    download_data(
      year = 2025,
      month = 9,
      dest = path_1
    )

    # Download data from September 2011
    path_2 <- tempfile("fcadata2")
    dir.create(path_2)
    download_data(
      year = 2011,
      month = 9,
      dest = path_2
    )

    compare_metadata(path_1, path_2)

Download FCA Call Report Data and Unzip
download_data(year, month, dest, files = NULL, quiet = FALSE)
| Argument | Description |
|---|---|
| year | (Integer) The year of the Call Report (e.g., |
| month | (String) The month of the Call Report (e.g., |
| dest | (String) The path to the directory into which the data will be downloaded (and unzipped) |
| files | (Optional) Character vector, representing the names of the files in the .zip archive to be downloaded; default is |
| quiet | (Optional) Logical. Controls whether download progress messages are displayed in the console. Defaults to |
FCA publishes Call Report data quarterly. These .zip files are
typically named "YYYYMarch.zip", "YYYYJune.zip", "YYYYSeptember.zip"
and "YYYYDecember.zip" (where YYYY represents the 4-digit year).
Therefore, valid values for the month argument should be limited to
c(3, 6, 9, 12), unless there is an anomaly in FCA's reporting/publishing.
Check https://www.fca.gov/bank-oversight/call-report-data-for-download to
ensure the data is available for the quarter you are interested in.
Ketchbrook Analytics downloads these files and stores them in a public AWS
S3 bucket, which is the location that download_data() retrieves them
from.
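The archive naming pattern described above can be illustrated with a small helper. This is a sketch under stated assumptions: `call_report_zip_name()` is a hypothetical function, not part of the package, and the strict quarter-end check is an assumption that would reject any anomalous publication month.

```r
# Hypothetical helper (not part of the package): build the "YYYYMonth.zip"
# archive name for a given quarter.
call_report_zip_name <- function(year, month) {
  # Accept a month number (9) or an English month name ("September")
  if (is.character(month)) {
    month <- match(tolower(month), tolower(month.name))
  }
  if (is.na(month) || !month %in% c(3, 6, 9, 12)) {
    stop("`month` should identify a quarter end: 3, 6, 9 or 12.")
  }
  paste0(year, month.name[month], ".zip")
}

call_report_zip_name(2025, "September")  # "2025September.zip"
call_report_zip_name(2025, 9)            # "2025September.zip"
```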
A console message informing the user of the directory into which the data was downloaded (and unzipped).

    path_1 <- tempfile("fcadata1")
    dir.create(path_1)
    download_data(
      year = 2025,
      month = "September",  # using the name of the month
      dest = path_1
    )
    list.files(path_1)

    path_2 <- tempfile("fcadata2")
    dir.create(path_2)
    download_data(
      year = 2025,
      month = 9,  # using the month number (to refer to September)
      dest = path_2,
      # only download the following files
      files = c(
        "D_INST.TXT",
        "INST_Q202509_G20251112.TXT"
      )
    )
    list.files(path_2)

Descriptions for data files
file_metadata
A data frame with 36 rows and 2 columns:
Data file root name
Short description of data file
Metadata files headers
get_codes_dict() searches for an internal .rda file in the specified package
and retrieves the codes dictionary based on the provided data name and naming
convention. The naming convention is assumed to include the data name followed
by a double underscore "__".
get_codes_dict(data_name)
| Argument | Description |
|---|---|
| data_name | A character string specifying the data name to retrieve the codes dictionary for. |
get_codes_dict() uses the provided data name to construct the expected naming
convention and searches for an internal .rda file in the specified package.
If found, it attempts to retrieve the codes dictionary using get and returns
it; otherwise, it returns NULL.
A list with the codes dictionary (codes_dict) and the associated
variable name (codes_varname) if found, otherwise each element will be NULL.
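The lookup convention can be illustrated with a toy environment standing in for the package's internal data. Everything in this sketch is hypothetical (the helper `find_codes_dict()`, the `toy` environment, and the data frame column names); it only shows how a `<data_name>__<variable>` name can be located with `ls()` and fetched with `get()`.

```r
# Hypothetical helper (not part of the package): look up an object named
# "<data_name>__<variable>" in an environment.
find_codes_dict <- function(data_name, env) {
  prefix <- paste0(data_name, "__")
  hits <- ls(env, pattern = paste0("^", prefix))

  # Nothing matched: return NULL elements, mirroring the documented value
  if (length(hits) == 0) {
    return(list(codes_dict = NULL, codes_varname = NULL))
  }

  list(
    codes_dict = get(hits[1], envir = env),
    codes_varname = sub(prefix, "", hits[1], fixed = TRUE)
  )
}

# Toy environment with made-up contents
toy <- new.env()
assign("RCB__INV_CODE", data.frame(code = 1:2, desc = c("a", "b")), envir = toy)
assign("RCB2__AssetCodeRCB2", data.frame(code = 1L, desc = "c"), envir = toy)

find_codes_dict("RCB", toy)$codes_varname  # "INV_CODE"
find_codes_dict("XYZ", toy)$codes_dict     # NULL
```

Including the double underscore in the search prefix is what keeps a lookup for "RCB" from also matching "RCB2__AssetCodeRCB2".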

    rcb_dict <- get_codes_dict("RCB")

    # Access codes dictionary
    rcb_dict$codes_dict

    # Access the name of the variable that stores the codes
    rcb_dict$codes_varname

process_data() reads the downloaded (and unzipped) .TXT files into tidy
data frames, applying the schema from the "D_" files to the corresponding raw
comma-separated data files, as well as storing the metadata from the "D_"
files.
process_data(dir)
| Argument | Description |
|---|---|
| dir | (String) The path to a folder containing FCA Call Report .TXT files for a single quarter |
process_data() assumes that metadata and data files share a common root
name (characters until the first underscore occurrence).
A list containing processed data and metadata.
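The shared-root-name assumption can be made concrete with a short base R sketch. The helper `root_name()` is hypothetical, as is the choice to strip the "D_" prefix before taking the text up to the first underscore or dot; the file names are the ones used in the examples in this manual.

```r
# Hypothetical helper (not part of the package): derive the root name that a
# data file and its metadata file are assumed to share.
root_name <- function(files) {
  files <- sub("^D_", "", files)  # drop the metadata prefix, if present
  sub("[_.].*$", "", files)       # keep the text before the first "_" or "."
}

files <- c(
  "D_RCB.TXT", "RCB_Q202509_G20251112.TXT",
  "D_INST.TXT", "INST_Q202509_G20251112.TXT"
)
is_meta <- startsWith(files, "D_")

# Pair each data file with the metadata file that has the same root name
pairs <- data.frame(
  root = root_name(files[!is_meta]),
  data_file = files[!is_meta],
  metadata_file = files[is_meta][
    match(root_name(files[!is_meta]), root_name(files[is_meta]))
  ]
)
pairs
```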

    path <- tempfile("fcadata")
    dir.create(path)
    download_data(
      year = 2025,
      month = "September",
      dest = path
    )

    processed_data <- process_data(path)

    # Access "RCB" data
    processed_data$data$RCB

    # Access "RCB" metadata
    processed_data$metadata$RCB

process_data_file() reads a data file, applies the provided metadata and codes dictionary,
and organizes the data into a tidy format. The column names are determined based on
the metadata scenario (e.g., "single", "single_multiple", "single_multiple_single").
process_data_file(file, metadata, dict = NULL)
| Argument | Description |
|---|---|
| file | (String) The path to the data file |
| metadata | A list containing the scenario and variable information obtained from the metadata file using |
| dict | (Optional) A data frame containing codes dictionary information |
process_data_file() processes the data file according to the metadata scenario.
It handles cases where variables have multiple occurrences and organizes the data
into a tidy format with appropriate column names. The function relies on the
read_data_file function for the actual data reading.
A tibble containing the processed data in a tidy format.

    path <- tempfile("fcadata")
    dir.create(path)
    download_data(
      year = 2025,
      month = "September",
      dest = path
    )

    process_data_file(
      file = file.path(path, "RCB_Q202509_G20251112.TXT"),
      metadata = process_metadata_file(file.path(path, "D_RCB.TXT")),
      dict = RCB__INV_CODE
    )

process_metadata_file() reads a metadata file and extracts information
about the column names, column types, decimal positions, and variable
definitions.
process_metadata_file(file)
| Argument | Description |
|---|---|
| file | (String) The path to the metadata file. |
process_metadata_file() processes metadata files following specific rules
to handle encoding, remove unnecessary information, and extract variable
details. It detects the scenario based on the occurrence of double asterisks
in variable names.
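The exact detection rule is not spelled out here, so the following is only one plausible reading of it, not the package's actual logic. It assumes that the scenario label mirrors the order in which runs of plain and "**"-marked variable names appear. The helper name and the variable names are hypothetical.

```r
# Hypothetical rule (an assumption, not the package's implementation): label
# each consecutive run of plain or "**"-marked variable names.
detect_scenario_sketch <- function(var_names) {
  starred <- grepl("**", var_names, fixed = TRUE)
  runs <- rle(starred)$values
  paste(ifelse(runs, "multiple", "single"), collapse = "_")
}

detect_scenario_sketch(c("SYSTEM", "AMOUNT"))             # "single"
detect_scenario_sketch(c("SYSTEM", "AMOUNT**"))           # "single_multiple"
detect_scenario_sketch(c("SYSTEM", "AMOUNT**", "TOTAL"))  # "single_multiple_single"
```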
A list containing the scenario (e.g., "single", "single_multiple",
"single_multiple_single") and a tibble with variable information.

    path <- tempfile("fcadata")
    dir.create(path)
    download_data(
      year = 2025,
      month = "September",
      dest = path
    )

    process_metadata_file(file.path(path, "D_RC1.TXT"))

Dictionary for INV_CODE variable of RCB file
RCB__INV_CODE
A data frame with 35 rows and 2 columns:
Integer code
Character description
D_RCB.TXT
Dictionary for AssetCodeRCB2 variable of RCB2 file
RCB2__AssetCodeRCB2
A data frame with 17 rows and 2 columns:
Integer code
Character description
D_RCB2.TXT
Dictionary for DebtMaturityCode variable of RCB3 file
RCB3__DebtMaturityCode
A data frame with 10 rows and 2 columns:
Integer code
Character description
D_RCB3.TXT
Dictionary for LOANSTATUS variable of RCF file
RCF__LOANSTATUS
A data frame with 6 rows and 2 columns:
Integer code
Character description
D_RCF.TXT
Dictionary for LOANSTATUS variable of RCF1 file
RCF1__LOANSTATUS
A data frame with 13 rows and 2 columns:
Integer code
Character description
D_RCF1.TXT
Dictionary for DerivCode variable of RCI2B file
RCI2B__DerivCode
A data frame with 23 rows and 2 columns:
Integer code
Character description
D_RCI2B_2018.TXT
Dictionary for ExposureCode variable of RCI2C file
RCI2C__ExposureCode
A data frame with 12 rows and 2 columns:
Integer code
Character description
D_RCI2C_2018.TXT
Dictionary for DerivRMCode variable of RCI2D file
RCI2D__DerivRMCode
A data frame with 11 rows and 2 columns:
Integer code
Character description
D_RCI2D_2018.TXT
Dictionary for ASSET_CODE variable of RCO file
RCO__ASSET_CODE
A data frame with 12 rows and 2 columns:
Integer code
Character description
D_RCO.TXT
Dictionary for RegCapCode variable of RCR3 file
RCR3__RegCapCode
A data frame with 15 rows and 2 columns:
Integer code
Character description
D_RCR3.TXT
Dictionary for RegCapCode variable of RCR7 file
RCR7__RegCapCode
A data frame with 29 rows and 2 columns:
Integer code
Character description
D_RCR7.TXT
read_data_file() reads a data file and processes it based on the provided metadata
and codes dictionary. The processing depends on the metadata scenario, which
includes cases like "single", "single_multiple", and "single_multiple_single".
For certain scenarios, the function utilizes read.csv to infer column
types without explicit specification.
read_data_file(file, metadata, dict)
| Argument | Description |
|---|---|
| file | A character string specifying the path to the data file. |
| metadata | A list containing the scenario and variable information obtained from the metadata file using |
| dict | A data frame containing codes dictionary information. |
read_data_file() reads the data file and applies necessary processing based
on the metadata scenario. For scenarios like "single" and "single_multiple", it
uses read.csv for convenient type inference. For "single_multiple_single",
it reads the file line by line, collapses every (N_CODES + 2) lines, and then reads
the collapsed lines using read.table.
A tibble containing the processed data.
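The collapse-then-parse step described for the "single_multiple_single" scenario can be sketched in base R. The helper `read_collapsed_sketch()` and the toy file contents are hypothetical, and the real raw file layout may differ; the sketch only shows how every (N_CODES + 2) lines can be joined into one record and then parsed with `read.table()`.

```r
# Hypothetical helper (not part of the package): join every (n_codes + 2)
# lines into one comma-separated record, then parse the records.
read_collapsed_sketch <- function(file, n_codes) {
  lines <- readLines(file, warn = FALSE)
  block_size <- n_codes + 2

  # Assign consecutive lines to blocks of `block_size`
  block_id <- ceiling(seq_along(lines) / block_size)
  collapsed <- vapply(
    split(lines, block_id), paste, character(1), collapse = ","
  )

  read.table(text = collapsed, sep = ",", header = FALSE,
             stringsAsFactors = FALSE)
}

# Toy input: two records, each spread over 4 lines (n_codes = 2)
toy_file <- tempfile(fileext = ".TXT")
writeLines(c("1,100", "10", "20", "5", "2,200", "30", "40", "6"), toy_file)

toy_data <- read_collapsed_sketch(toy_file, n_codes = 2)
dim(toy_data)  # 2 rows, 5 columns
```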
Dictionary for CAP_CODE variable of RID file
RID__CAP_CODE
A data frame with 12 rows and 2 columns:
Integer code
Character description
D_RID.TXT
Dictionary for ACLCode variable of RIE1 file
RIE1__ACLCode
A data frame with 7 rows and 2 columns:
Integer code
Character description
D_RIE1.TXT