FreeBMD Database Tables - Districts & Aliases

Welcome to the FreeBMD District Aliasers' reference page. It describes some of the database tables that are relevant to understanding the Perl code that creates and uses the district aliases. This page contains my own observations, based on reading the source code and clarified in places by emailing Dave Mayall. The responsibility for any mistakes is mine alone.

Which tables are important to understand district aliasing?

Table Name
    Relevant Fields and Definitions (for district aliasing)

Districts
    DistrictNumber   INT UNSIGNED NOT NULL AUTO_INCREMENT   PRIMARY KEY(DistrictNumber)
    DistrictName     VARCHAR(50) NOT NULL                   UNIQUE(DistrictName)
    Invented         TINYINT NOT NULL
    YearStart        SMALLINT
    QuarterStart     TINYINT
    several others (not affecting the construction of district map)

DistrictSynonyms
    DistrictNumber   INT UNSIGNED NOT NULL
    DistrictName     VARCHAR(50) NOT NULL                   PRIMARY KEY(DistrictName,Volume,YearStart,QuarterStart)
    Volume           CHAR(3) NOT NULL
    YearStart        SMALLINT NOT NULL
    QuarterStart     TINYINT NOT NULL
    Misspelt         TINYINT NOT NULL
    YearEnd          SMALLINT
    QuarterEnd       TINYINT

DistrictToCounty
    DistrictNumber   INT UNSIGNED NOT NULL                  PRIMARY KEY(DistrictNumber,County)
    County           CHAR(3) NOT NULL                       Chapman county code

Submissions
    District         VARCHAR(50) NOT NULL                   INDEX (District)
    DistrictNumber   INT UNSIGNED NOT NULL                  INDEX (DistrictNumber)
    Surname          VARCHAR(50) NOT NULL                   INDEX (Surname(8))
    Volume           VARCHAR(10) NOT NULL
    RomanVolume      VARCHAR(3) NOT NULL                    INDEX (RomanVolume)
    Page             VARCHAR(10) NOT NULL                   INDEX (Page)
    AccessionNumber  INTEGER NOT NULL                       INDEX (AccessionNumber), PRIMARY KEY (AccessionNumber,SequenceNumber)
    SequenceNumber   INTEGER NOT NULL
    several others (not affecting the construction of district map)
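
The composite primary key on DistrictSynonyms means that the same spelling may appear more than once, as long as the volume or start date differs. The following is a minimal sketch of that behaviour in Python, not the project's Perl code. It uses SQLite as a stand-in for the real database engine (which this page does not name), so the column types are approximations, and the district names, volumes and dates are made up for illustration.

```python
import sqlite3

# Simplified SQLite rendering of DistrictSynonyms. Column names come from the
# table definitions above; the types are SQLite approximations (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE DistrictSynonyms (
        DistrictNumber INTEGER NOT NULL,
        DistrictName   VARCHAR(50) NOT NULL,
        Volume         CHAR(3) NOT NULL,
        YearStart      SMALLINT NOT NULL,
        QuarterStart   TINYINT NOT NULL,
        Misspelt       TINYINT NOT NULL,
        YearEnd        SMALLINT,
        QuarterEnd     TINYINT,
        PRIMARY KEY (DistrictName, Volume, YearStart, QuarterStart)
    )
    """
)

insert = "INSERT INTO DistrictSynonyms VALUES (?, ?, ?, ?, ?, ?, ?, ?)"

# Hypothetical rows: the same spelling under two different volumes is allowed,
# because Volume is part of the primary key.
conn.execute(insert, (1, "Exampleton", "1a", 1837, 3, 0, None, None))
conn.execute(insert, (2, "Exampleton", "2b", 1837, 3, 0, None, None))

# Repeating the full key (name, volume, start year, start quarter) is rejected.
try:
    conn.execute(insert, (3, "Exampleton", "1a", 1837, 3, 1, None, None))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM DistrictSynonyms").fetchone()[0]
```

Here `duplicate_rejected` ends up true and `row_count` stays at two, which is why a lookup by spelling alone is not enough to identify a single DistrictSynonyms row.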

As the database is built when MakeDB.pl is run, the following happens for each entry:

  1. After converting the district to lower case and removing leading and trailing spaces and any ? characters, it is checked against the in-memory representation of the DistrictSynonyms table.
  2. If it is not in the in-memory representation:
    1. it is inserted into the tables Districts and DistrictSynonyms using the full spelling of the district with any leading spaces removed;
    2. the stripped version is inserted into the in-memory representation.
  3. The entry is inserted into the table Submissions with the district spelling totally unchanged and the district_id added in the DistrictNumber field.
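
The per-entry steps above can be sketched roughly as follows. This is an illustrative Python outline, not the Perl in MakeDB.pl: the lists stand in for the database tables, the function names `normalise`, `find_district` and `insert_entry` are hypothetical, the counter stands in for AUTO_INCREMENT, and the year, quarter and volume parameters that the real FindDistrict takes are left out for brevity. The district names are made up.

```python
# In-memory stand-ins for the database tables (assumptions for illustration).
districts = []          # rows of (DistrictNumber, DistrictName)
district_synonyms = []  # rows of (DistrictNumber, DistrictName)
submissions = []        # rows of (District, DistrictNumber)
known = {}              # normalised spelling -> DistrictNumber


def normalise(district):
    """Lower-case, trim both ends, then drop every '?' character."""
    return district.lower().strip().replace("?", "")


def find_district(district):
    """Return the district_id, creating a new district if the spelling is unknown."""
    lcd = normalise(district)
    if lcd in known:
        return known[lcd]
    district_id = len(districts) + 1      # stands in for AUTO_INCREMENT
    spelling = district.lstrip()          # full spelling, leading spaces removed
    districts.append((district_id, spelling))
    district_synonyms.append((district_id, spelling))
    known[lcd] = district_id              # stripped version kept in memory
    return district_id


def insert_entry(district):
    """Store the entry with its spelling unchanged plus the district_id."""
    district_id = find_district(district)
    submissions.append((district, district_id))
    return district_id


first_id = insert_entry("  Exampleton?")   # unknown: creates district 1
second_id = insert_entry("EXAMPLETON ")    # same district after normalising
third_id = insert_entry("Otherham")        # unknown: creates district 2
```

In this sketch the Districts stand-in keeps the spelling "Exampleton?" (only leading spaces removed), while Submissions keeps the spelling exactly as transcribed, including the leading spaces.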

Key to district aliasing

Within the MakeDB.pl phase of building the database, the function bldDistricts builds an in-memory representation of the DistrictSynonyms table, converting all characters to lower case. This is a 'go-faster stripe' mechanism to reduce database accesses: all matching is done against an array rather than by doing lookups in the database.

It is called in two places: first at the start, to load into the array the known aliases that the earlier AddDistrictMap.pl phase of building the database has already inserted into the table DistrictSynonyms; and again every time a new unknown district is found as data is read from the transcribers' files and inserted into Submissions.

It is used to determine whether the district field on the entry that is about to be inserted into the table Submissions is a known district or an unknown district, and to return the district_id. These two fields are used to link together the entries, held in the tables described above, that correspond to the same line in a transcribed file.
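
The two call sites of bldDistricts can be illustrated with a small Python sketch. It assumes that each call records a single spelling, and it uses a dictionary keyed by the lower-cased spelling where the real code is described as using an array. The function name `bld_districts`, its signature and the sample rows are all hypothetical.

```python
def bld_districts(lookup, district_name, district_number):
    """Record one spelling in the in-memory lookup, keyed in lower case."""
    lookup[district_name.lower()] = district_number


# Call site 1: initial load from rows selected out of DistrictSynonyms
# (made-up rows: two spellings of district 1, one of district 2).
synonym_rows = [("Exampleton", 1), ("Exampletown", 1), ("Otherham", 2)]
lookup = {}
for name, number in synonym_rows:
    bld_districts(lookup, name, number)

# Call site 2: a new, previously unknown spelling has just been inserted
# into the database tables, so the in-memory copy is brought up to date.
bld_districts(lookup, "Newplace", 3)

known_id = lookup.get("exampletown")   # an alias resolves to its district
unknown_id = lookup.get("nowhere")     # None: not a known spelling
```

Because every key is lower-cased on the way in, a lookup only has to lower-case the transcribed spelling to match it, with no database access.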

What executables use district map related tables and the array?

The table below is an attempt to show the usage made of all database tables and the in-memory array that is a copy of DistrictSynonyms, along with the variables through which the district field as transcribed is processed as it is converted to a number and optionally added into the database tables and in-memory array.

The main file that is called to build a new database is Makefile. The rows in the table below appear in the order in which they are performed; the line numbers are as they were at the time of writing and may no longer correspond to the live software.

Each row gives: Executable and Line Number; Function / Instruction; Data; Operation (e.g. SQL Insert / Select / Update / Delete).

AddDistrictMap.pl 23-27
    Function / Instruction:
        opendir ...
        foreach my $file ...
        next ...
        open(MAP,"$DistDir/$file") ...
        while(<MAP>) {
    Data: Set up nested loops and necessary file connections to read each line from each [A-Z].txt file in turn
    Operation: read [A-Z].txt files

AddDistrictMap.pl 30-31
    Function / Instruction:
        next if /^#/ || /^\s*$/;
        chomp;
        my @districts=split /\|/
    Data: Each line that is neither a comment nor blank, from each data/districtmap/[A-Z].txt type of file, is read in turn into an array @districts, with the | character used to split the line into consecutive elements of the array. (One line is read and processed before moving on to the next line, and then to the next file, until all lines in all the [A-Z].txt files have been read and processed.)
    Operation: create @districts

AddDistrictMap.pl 34
    Function / Instruction: my @distdata=split /\+/,$districts[0];
    Data: Read the first element of the array @districts (i.e. the first field for the registration district from the A.txt type of file) into an array @distdata, with the + character used to split this field into consecutive elements of the array
    Operation: create @distdata

AddDistrictMap.pl 35
    Function / Instruction: $districts[0]=$distdata[0];
    Data: Replace the first element of the array @districts with just the district name. This will represent the standard spelling.
    Operation: update $districts[0]

AddDistrictMap.pl 36-54
    Function / Instruction: see file
    Data: More processing of @distdata
    Operation: find the number corresponding to the primary county associated with this registration district

AddDistrictMap.pl 69
    Function / Instruction: INSERT INTO Districts
    Data: Create an entry in the table Districts with the standard spelling in the DistrictName field and the associated information from @distdata in the appropriate other fields
    Operation: create DistrictNumber field corresponding to the DistrictName from $districts[0]

AddDistrictMap.pl 74-84
    Function / Instruction: INSERT INTO DistrictToCounty
    Data: Create an entry in the table DistrictToCounty using the @distdata part of the information about the registration district
    Operation: insert single entry into table DistrictToCounty for this district

AddDistrictMap.pl 86-113
    Function / Instruction: INSERT INTO DistrictSynonyms
    Data: For districts held in array elements $districts[0] onwards, i.e. the standard spelling and every alias, process the ! and % information if present and strip it from the spelling. If a ! was present, set Misspelt to 1. If it was not present, set Misspelt to 0.
    Operation: insert DistrictNumber, DistrictName, Misspelt, Volume and others into the table DistrictSynonyms

MakeDB.pl 57-61
    Function / Instruction: SELECT FROM DistrictSynonyms and call bldDistricts
    Data: Create array containing all standard districts and their aliases as held in DistrictSynonyms at the end of AddDistrictMap.pl, i.e. one entry for each district named in every [A-Z].txt file
    Operation: create in-memory copy of DistrictSynonyms table

MakeDB.pl 65-67
    Function / Instruction: DELETE FROM Submissions
    Data: table Submissions emptied
    Operation: delete

MakeDB.pl 83-100
    Function / Instruction:
    Data: Set up nested loops to read all the files submitted by each transcriber
    Operation:

MakeDB.pl 124
    Function / Instruction:
        InsertContent called
        273-309
    Data: Data from a complete transcribed file
    Operation: insert and update in various places

InsertContent 288
    Function / Instruction: FindDistrict called
    Data: district_id added to data line from transcribed file
    Operation: insert

FindDistrict 189-237
    Function / Instruction:
        4 parameters used
        district, year, quarter, volume
    Data: Returns the unique district_id
    Operation:
        For known districts, read
        For unknown districts, create

FindDistrict 195-198
    Function / Instruction: value assigned to lcd
    Data:
        Create lcd by
        converting all characters to lower case
        removing any leading and trailing spaces
        removing all ? characters
    Operation: create

FindDistrict 199
    Function / Instruction: assign value to district
    Data: remove any leading spaces
    Operation: update

FindDistrict 201-217
    Function / Instruction: return district_id if known
    Data: for a known district the corresponding district_id is read and returned
    Operation: read district_id for 'known' district

FindDistrict 222-224
    Function / Instruction: Insert new district into the Districts table
    Data: Insert the new TWYS (type what you see) spelling of the district, with leading spaces removed, into the Districts table
    Operation: insert into Districts

FindDistrict 225-227
    Function / Instruction: Insert new district into the DistrictSynonyms table
    Data: Insert the new TWYS spelling of the district, with leading spaces removed, into the DistrictSynonyms table
    Operation: insert into DistrictSynonyms

FindDistrict 228-229
    Function / Instruction: assign district_id
    Data: Get district_id for the new spelling
    Operation: create district_id for 'unknown' district

FindDistrict 230
    Function / Instruction: bldDistricts called
    Data: Update the in-memory copy of DistrictSynonyms table with the lcd spelling of the new district
    Operation: update array

InsertContent 291-301
    Function / Instruction: INSERT INTO Submissions
    Data: Copy of the data line from the transcribed file, including the TWYS spelling of the district column, with district_id added, is inserted into table Submissions
    Operation: insert
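
The line-splitting steps that AddDistrictMap.pl performs at lines 30-35 can be mirrored in a short Python sketch. This is only an approximation of the Perl statements quoted above. The function name `parse_map_line` and the sample line are made up, and the meaning of the fields after the first + is not covered here. One known difference: Perl's split drops trailing empty fields by default, whereas Python's str.split keeps them.

```python
def parse_map_line(line):
    """Return (districts, distdata) for one map line, or None if it is skipped."""
    # Roughly mirrors: next if /^#/ || /^\s*$/;
    if line.startswith("#") or not line.strip():
        return None
    # Roughly mirrors: chomp;
    line = line.rstrip("\n")
    # Roughly mirrors: my @districts=split /\|/
    districts = line.split("|")
    # Roughly mirrors: my @distdata=split /\+/,$districts[0];
    distdata = districts[0].split("+")
    # Mirrors: $districts[0]=$distdata[0];
    districts[0] = distdata[0]
    return districts, distdata


# Hypothetical map line: a standard spelling with two extra +-separated
# fields, followed by two aliases.
parsed = parse_map_line("Exampleton+A+B|Exampletown|Exampelton\n")
skipped_comment = parse_map_line("# a comment line\n")
skipped_blank = parse_map_line("   \n")
```

After the call, `parsed[0]` holds the standard spelling followed by its aliases, and `parsed[1]` holds the +-separated pieces of the first field, matching the roles of @districts and @distdata in the rows above.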



Search engine, layout and database Copyright © 1998-2025 Free UK Genealogy CIO, a charity registered in England and Wales, Number 1167484.
We make no warranty whatsoever as to the accuracy or completeness of the FreeBMD data.
Use of the FreeBMD website is conditional upon acceptance of the Terms and Conditions

