Political piece is here.
precinctid | activeaverage | inactiveaverage | diffaverage
------------+---------------+-----------------+-------------
268 | 53.6 | 36.1 | 17.5
203 | 52.9 | 36.4 | 16.5
139 | 45.9 | 30.6 | 15.4
202 | 60.1 | 45.3 | 14.8
167 | 53.9 | 39.2 | 14.7
103 | 58.9 | 44.5 | 14.4
205 | 50.7 | 36.4 | 14.3
201 | 52.5 | 39.2 | 13.3
144 | 49.6 | 36.6 | 13.1
131 | 52.2 | 39.5 | 12.7
....
There are several new Postgres moves here for me. At this point, I expect my joins to become more sophisticated in the future. Postgres syntax:
avg(age::int4) OVER (partition by precinctid)
allows a statistical slice of a factor similar to xtabs (cross tabulation) in R. Quite frankly, I think R does this with greater fluidity and less code.
R, Julia, SQL, Octave and others: Personal notes on data analysis, computation, data access most especially for querying voter history, Census, PDC, and other election data. Reader is advised to just paste the code text into Notepad++.
Thursday, September 25, 2014
Tuesday, September 23, 2014
SQL for comparing Active vs Inactive voters over time
/* 6:20 PM 9/23/2014 -RMF Political piece for this code is here.
I create query to dump out the various (voter history) databases I wish to compare to Postgres 'views'. A 'view' in Postgres is essentially and 'in memory' table. There's more to than that but...:
Select * from voterdb where statuscode = 'I'; -- 'I' for 'inactive'
The unique voter registration number isn't quite reliable as a primary key over time (in my humble opinion) so I use a 'unique tuple' (ARRAY[lastname,firstname,middlename]) of my own invention.
Something like ARRAY[lastname,firstname,middlename,registrationnumber::TEXT] would be even more unique. You also need the 119 precincts of LD 42.
*/
I create query to dump out the various (voter history) databases I wish to compare to Postgres 'views'. A 'view' in Postgres is essentially and 'in memory' table. There's more to than that but...:
Select * from voterdb where statuscode = 'I'; -- 'I' for 'inactive'
The unique voter registration number isn't quite reliable as a primary key over time (in my humble opinion) so I use a 'unique tuple' (ARRAY[lastname,firstname,middlename]) of my own invention.
Something like ARRAY[lastname,firstname,middlename,registrationnumber::TEXT] would be even more unique. You also need the 119 precincts of LD 42.
*/
Monday, September 15, 2014
SQL to query active vs. inactive voters, primary vs. general election participation
Postgres SQL code to query a Whatcom County Voter database to comapare previous and current elections for active vs. inactive voters, precincts vs. general election participation via ballotcounted and precinctid fields. Political piece is here.