Data Collection and Management
PCG builds custom data collection and management systems for organizations at every scale, from small businesses replacing spreadsheets with a proper database, to mid-size operations moving to SQL Server, to enterprises running complex multi-source data warehouses. The right solution depends on how much data is generated, how many people need to access it simultaneously, what reporting and analysis it needs to support, and what security and compliance requirements apply. PCG has been building these systems since 1995.1
What questions should you answer before choosing a data collection solution?
The answers to these eight scoping questions (the factors mapped in the table below) determine which data collection and management architecture fits your operation. PCG asks all of them at the start of every engagement. Getting the answers wrong at the scoping stage produces a system that matches the specification on paper but not the actual data problem.
Which data collection and management scale is right for your operation?
There is no universal answer. A small business replacing a spreadsheet and a mid-size manufacturer tracking production across three facilities have fundamentally different requirements for data volume, concurrent users, transaction logging, and query performance. The table below maps the three scales of data solution PCG builds, with the specific platforms and operational characteristics that apply to each.2
| Factor | Small-Scale | Mid-Scale | Large-Scale |
|---|---|---|---|
| Typical users | 1 to 5 concurrent users, single location | 5 to 50 concurrent users, one or more locations | 50+ concurrent users, multi-site or enterprise |
| Data volume | Under 1 million records per table. Flat or lightly relational. | Millions of records. Fully relational across multiple tables. | Hundreds of millions of records. Multi-source. Potentially distributed. |
| Platforms | Microsoft Access, SQLite, Excel-backed systems | MySQL, Microsoft Access with SQL Server back-end, SQLite for web | Microsoft SQL Server, Amazon RDS, Azure SQL, Oracle |
| Transaction logging | Basic. Limited rollback capability. | Moderate. Query logging and some rollback. | Full transaction logging. Complete rollback and audit trail capability. |
| Query performance | Fast for small datasets. Degrades as records grow past ~100K per table. | Strong for typical business reporting. Handles complex joins across multiple tables. | Optimized for high-volume, high-concurrency queries. Index strategies required. |
| Migration path | PCG designs small-scale systems with SQL Server migration documented from day one. | Can migrate to large-scale SQL Server or cloud platform as data volumes grow. | Full enterprise architecture. Horizontal scaling available for cloud deployments. |
| Hardware investment | Minimal. Runs on standard workstation. | Moderate. Dedicated server recommended for multi-user environments. | Substantial. Dedicated database server or cloud infrastructure required. |
| PCG typical build time | 2-6 weeks for standard deployments | 4-12 weeks depending on complexity | 8-24 weeks for full architecture build-out |
What is a data warehouse and when does your operation need one?
A data warehouse is a centralized repository that consolidates data from multiple source systems into a single location structured specifically for analysis and reporting. It is not a replacement for operational databases. It is a separate layer that pulls from them, resolves the inconsistencies between them, and presents a unified view of the organization's data for business intelligence purposes.
The decision to build a data warehouse is driven by a specific operational problem: leadership cannot get a current, accurate picture of business performance because the data that would answer their questions lives in several systems that do not communicate with one another. The warehouse solves that by becoming the single source of truth that all of those systems feed.
A data warehouse is built to handle data volumes that would overwhelm operational databases. It is designed for read-heavy workloads: complex analytical queries that join across years of historical records, aggregation queries that summarize millions of transactions, and reporting queries that run across the entire dataset simultaneously.
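The sketch below shows the shape of one such read-heavy aggregation. It uses Python's built-in sqlite3 module and an invented sales_fact table as a stand-in for a real warehouse platform; every table name, column, and value is hypothetical, not taken from any PCG system.

```python
import sqlite3

# Illustrative only: an in-memory SQLite stand-in for a warehouse fact table.
# A real warehouse would run on SQL Server, Amazon RDS, Azure SQL, or similar.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (region TEXT, sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [("East", "2023-01-15", 120.0), ("East", "2023-02-03", 80.0),
     ("West", "2023-01-22", 200.0), ("West", "2023-01-30", 50.0)],
)

# The shape of a typical warehouse query: a read-heavy aggregation across the
# whole history, grouped for reporting rather than single-record lookup.
rows = conn.execute(
    """
    SELECT region,
           strftime('%Y-%m', sale_date) AS month,
           COUNT(*)    AS transactions,
           SUM(amount) AS revenue
    FROM sales_fact
    GROUP BY region, month
    ORDER BY region, month
    """
).fetchall()

for region, month, transactions, revenue in rows:
    print(f"{region} {month}: {transactions} transactions, {revenue:.2f} revenue")
```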
Data from operational systems, external feeds, historical archives, and third-party platforms is extracted, transformed to a consistent format, and loaded into the warehouse on a defined schedule. The transformation step resolves the inconsistencies between source systems: different date formats, different customer ID structures, different product naming conventions.
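A minimal sketch of that transformation step, assuming two hypothetical source extracts that disagree on date format and customer ID structure. The field names, formats, and transforms below are illustrative, not taken from any PCG system.

```python
from datetime import datetime

def transform_crm_row(row):
    """Normalize a row from a hypothetical CRM extract (US-style dates,
    bare numeric customer IDs) into the warehouse's agreed format."""
    return {
        "customer_id": f"CUST-{int(row['cust_no']):06d}",
        "order_date": datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat(),
        "product": row["product_name"].strip().upper(),
    }

def transform_billing_row(row):
    """Normalize a row from a hypothetical billing extract (ISO dates,
    'C'-prefixed customer IDs) into the same warehouse format."""
    return {
        "customer_id": "CUST-" + row["customer"].lstrip("C").zfill(6),
        "order_date": row["invoice_date"],  # already ISO 8601
        "product": row["item"].strip().upper(),
    }

# Each source gets its own transform; the load step only ever sees rows in the
# single agreed-upon structure, so inconsistencies between systems are resolved
# before the data reaches the warehouse.
crm_row = {"cust_no": "4821", "date": "03/17/2024", "product_name": " Widget A "}
billing_row = {"customer": "C4821", "invoice_date": "2024-03-17", "item": "widget a"}

print(transform_crm_row(crm_row))        # same customer, same canonical form
print(transform_billing_row(billing_row))
```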
The warehouse exposes its data to reporting tools, business intelligence platforms, compliance systems, and analytical applications through a consistent interface. Multiple applications can query the same data simultaneously without affecting the performance of the operational systems that feed it. Metadata within the warehouse documents what each data element means and where it came from.
How do you prepare to identify the right data collection solution for your business?
Most data collection problems are not solved by choosing the right software. They are solved by understanding what data the business actually needs and what it is currently doing with the data it has. PCG works through these four preparation steps with every client before recommending a solution architecture.
Audit the data you are currently collecting
Map every data source in the organization: what is being collected, where it is stored, in what format, and by whom. This audit almost always surfaces data that is being collected in multiple places in incompatible formats, data that is collected but never used, and operational questions that leadership asks regularly but the current system cannot answer. The gap between what the data can answer and what the business needs to know is the requirement for the new system.
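The sketch below shows, under invented system and field names, the kind of finding this audit surfaces: the same field collected in two places in incompatible formats, and a field that is collected but never used in any report.

```python
from collections import defaultdict

# Hypothetical audit inventory: where each data element is collected today.
sources = [
    {"system": "Sales spreadsheet", "field": "customer_name", "format": "free text"},
    {"system": "Invoicing app",     "field": "customer_name", "format": "Last, First"},
    {"system": "Invoicing app",     "field": "invoice_total", "format": "currency"},
    {"system": "Shop-floor log",    "field": "batch_weight",  "format": "lbs, hand-keyed"},
]

# Fields that actually appear in current reports (also hypothetical).
used_in_reports = {"invoice_total"}

# Group collection points by field, then flag duplication and unused data.
by_field = defaultdict(list)
for s in sources:
    by_field[s["field"]].append((s["system"], s["format"]))

for field, places in by_field.items():
    if len(places) > 1:
        print(f"Collected in multiple places: {field} -> {places}")
    if field not in used_in_reports:
        print(f"Collected but never reported on: {field}")
```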
Separate active data from dead data
Not all data in your current system needs to migrate to a new one. Dead data is data that was collected for a purpose that no longer exists, data that is so structurally inconsistent it cannot be reliably used for any analysis, and data that is retained only because nobody has decided to remove it. Migrating dead data inflates the scope and cost of the new system without adding analytical value. PCG helps identify what moves forward and what gets archived or removed.
Define the outputs before designing the inputs
The reports, dashboards, compliance documents, and analytical outputs the organization needs to produce determine what data must be collected and in what structure. A system designed without knowing what it needs to produce will require structural changes when reporting requirements become clear after deployment. PCG establishes the required outputs before designing the collection structure, not after.
Map the priority order and the data flow sequence
Not all data is equally important. Some data drives decisions that affect revenue, compliance, or operational continuity. Other data is useful but not critical. PCG helps prioritize which data elements are required from day one versus which can be added in subsequent phases, and maps the sequence in which data flows through the organization to identify where collection happens and where it needs to happen for the system to work correctly.
What data collection and management services does PCG provide?
- Small-scale to full-scale data systems. PCG builds data collection and management systems ranging from single-user small-business deployments to multi-source enterprise data warehouses. The architecture matches the actual scale of the data problem, not a predefined tier.
- Data integrity, confidentiality, and security. PCG builds access controls, encryption, audit logging, and backup procedures into every data system from the design stage. For organizations with regulatory compliance requirements under HIPAA, EPA regulations, or financial compliance frameworks, these requirements are specified during design and verified during testing.
- Secure web services for database access. Web-accessible database interfaces with role-based access control, encrypted connections, and session management. Staff query and update the database through a browser without requiring client software installation on every machine.
- Multi-user access control. Access controls defined at the record level, the field level, and the function level. The compliance officer's view of the database is different from the operations manager's view, which is different from the data entry clerk's view, because each role has different data access requirements that the system enforces automatically (see the sketch after this list).
- Backup management and disaster recovery. Automated backup schedules with verified restore testing, off-site backup storage, and documented recovery procedures that define exactly how long it takes to restore the system to a specific point in time. For PCG-hosted systems, backup and recovery are included in the hosting arrangement.
- Inventory management across multiple platforms. PCG builds inventory management systems on Microsoft Access for small operations, MySQL and SQLite for web-integrated systems, and SQL Server for high-volume multi-site deployments. The platform is chosen based on the inventory operation's scale and access requirements, not on PCG's platform preferences.
- Data transformation and presentation management. Converting raw collected data into the formats required by reporting tools, compliance systems, and analytical platforms. This includes scheduled data exports, automated report generation, dashboard feeds, and API outputs for connected applications.
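A minimal sketch of the field-level portion of the access model described in the multi-user access control item above, with invented roles, fields, and values. A production system would typically enforce this in the database layer (views, column permissions, row-level security) rather than in application code.

```python
# Hypothetical role-to-field mapping; every role and field name is illustrative.
ROLE_FIELDS = {
    "compliance_officer": {"case_id", "inspection_date", "violation_code", "inspector"},
    "operations_manager": {"case_id", "inspection_date", "site", "status"},
    "data_entry":         {"case_id", "site", "status"},
}

def filter_record(record: dict, role: str) -> dict:
    """Return only the fields the given role is permitted to see."""
    allowed = ROLE_FIELDS.get(role, set())
    return {k: v for k, v in record.items() if k in allowed}

record = {
    "case_id": "2024-0117",
    "inspection_date": "2024-05-02",
    "site": "Plant 3",
    "status": "open",
    "violation_code": "VC-12",
    "inspector": "J. Smith",
}

# Each role sees a different view of the same underlying record.
print(filter_record(record, "compliance_officer"))
print(filter_record(record, "data_entry"))
```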
1 PCG data collection and management system history documented from project records across all scales and industries, 1995-2026.
2 Platform recommendations based on PCG deployment experience across small, medium, and large-scale data environments. Platform suitability thresholds reflect observed performance characteristics, not vendor specifications.
Frequently Asked Questions
Allison has been designing data collection and management systems since the early 1980s, predating PCG's founding in 1995. Her work spans every scale described on this page: small Access databases for family businesses, mid-scale SQL Server systems for manufacturing and environmental operations, and enterprise-level data collection platforms for ExxonMobil, Nabisco, and AXA Financial. The EPA pesticide inspection and case tracking system PCG built in 2004 has been in continuous production since January 2005 with practically zero downtime.
The lesson that holds across every scale: the quality of the data a system produces is determined at the collection stage, not the reporting stage. A database built on top of poorly structured data collection produces reports that require manual correction before they can be trusted. PCG builds the collection structure and the reporting structure together, so neither undermines the other.