Blog
Long-Term Retention of Bioanalytical Watson LIMS™ Studies: A Compliance-First Perspective
Most bioanalytical labs running Watson LIMS™ have never made a conscious decision about archiving — they have deferred it. This paper examines what 21 CFR Part 11 and EU GMP Annex 11 actually require, and why format independence is a regulatory posture, not an IT preference.

Executive Summary
Most bioanalytical laboratories running Thermo Watson LIMS™ have, by default, adopted a data retention strategy that was never designed to be a long-term archive. Keeping Watson running because completed studies are still held within it is not an appropriate archive solution — it is a deferred decision. And that decision grows more expensive, more fragile, and more difficult to justify with every passing year.
This paper examines three questions that QA Directors and Bioanalytical Lab Directors working with Watson™ environments regularly face:
- What do 21 CFR Part 11 §§11.10(b) and (e) and EU GMP Annex 11 §7 actually require of a long-term study archive — and what do they not require?
- Does Watson™’s own archiving functionality satisfy those requirements?
- What does a genuinely regulator-ready, vendor-independent long-term archive look like, and how is data integrity maintained across a 15-year retention horizon?
The answers matter for two reasons. First, regulatory guidance on electronic records and data integrity is tightening: EMA and PIC/S opened a joint public consultation on the revision of Annex 11 in 2025, keeping the topic firmly on QA leadership agendas (EMA & PIC/S, 2025). Second, the practical risks of leaving this question unaddressed are compounding — Oracle™ support escalation, Windows Server end-of-support, post-merger duplicate systems, and the slow erosion of institutional knowledge about legacy infrastructure.
The conclusion is straightforward: the decision to archive is a compliance decision, not an IT decommissioning decision. And the format in which data is archived determines whether that compliance is robust or contingent.
The Compliance Baseline: What the Regulations Actually Require
Before evaluating any archiving approach, it is worth reading the relevant regulatory text directly, rather than through the lens of received interpretations.
21 CFR Part 11 §11.10(b) — Human-Readable and Electronic Copies
Section 11.10(b) requires that regulated systems be capable of generating “accurate and complete copies of records in both human-readable and electronic form suitable for inspection, review, and copying by the agency”. ICH M10 reinforces the completeness dimension: all method validation data and analytical results supporting regulatory submissions must be fully documented, including failed runs and out-of-specification runs.
| §11.10(b) defines the output requirement — readable, complete, accurate copies. It says nothing about which system must hold the records in order to produce them. A well-constructed archive of PDF/A documents, CSV, and XML satisfies §11.10(b) without Watson™, without Oracle™, and without any proprietary viewer. |
This distinction matters operationally. Keeping Watson LIMS™ running to satisfy this requirement is one way to meet it — but it is not the only way, and it is arguably not the most robust way over a retention horizon measured in decades.
21 CFR Part 11 §11.10(e) — Audit Trail Integrity
Section 11.10(e) requires “computer-generated time-stamped audit trails to independently record the date and time of operator entries and actions that create, modify, or delete electronic records”. The audit trail must be retained for at least as long as the subject electronic records themselves.
An archive that captures the Watson™ audit trail — user identity, timestamp, and action type for every entry Watson™ recorded against the study, including study reopens, result rejections, reassay events, and approval withdrawals — satisfies this requirement. The criterion is completeness: the archive must contain what Watson™ held, without filtering, aggregation, or summarisation.
EU GMP Annex 11 §7 — Data Storage
Annex 11 §7 requires that data be stored “in a way that is accessible, readable, and accurate throughout the defined retention period”. Backup integrity must be tested. For electronic records, the storage medium must be validated, and the data must remain readable regardless of changes to the underlying technology.
| The phrase “regardless of changes to the underlying technology” is the operative standard for format selection. An archive whose readability depends on the continued availability of a specific Watson™ version, Oracle™ Enterprise Edition release, or Windows Server build does not robustly satisfy this requirement. |
21 CFR 58.195 — GLP Retention Periods
Under GLP, 21 CFR 58.195 specifies minimum record retention periods of two to five years post-submission. Industry practice for raw data, calibration curves, and audit trails typically extends to 15 years or longer. Across that horizon, the question of format independence is not theoretical — it is the central engineering challenge of compliant long-term retention.
“But Watson™ Already Has a Study Archive” — Why That Is Not Enough
This is the first response most QA teams give when the subject of archiving is raised, and it deserves a direct answer. Watson™ does include a Study Archive function, and for teams that have not closely examined its technical characteristics, it is easy to assume that running Watson Archive™ satisfies the long-term retention requirement. The gaps, however, are real and material.
Coverage Is Version-Dependent
The completeness of a Watson Study Archive™ export varies across Watson™ versions. In several versions, the function does not export Sample Information. This is not a minor omission: sample identity, sample matrix, nominal concentration, and related metadata are core elements of the Watson™ study record. A regulatory query concerning sample handling — which samples were reassayed, which were excluded, and on what basis — cannot be fully answered from an archive that does not contain this data.
The coverage question is not resolved by confirming that a particular site runs a version where certain data happens to be included. The underlying issue is structural: the completeness of the archive depends on the Watson™ version in use at the time of the archive run, and that version will eventually cease to be supported.
Proprietary Format — No Long-Term Readability Guarantee
The Watson Study Archive™ output is based on a proprietary Microsoft data format, not an ISO-standardized long-term preservation format. Its future readability depends on the continued availability of the software stack that reads it.
| ISO 19005 (PDF/A) was developed specifically to address the “still readable in 20 years” problem. It prohibits features that depend on external dependencies and requires that all resources needed to render a document are embedded within the file itself. A proprietary Microsoft-based format carries no equivalent long-term readability guarantee — and Annex 11 §7 requires readability regardless of changes to the underlying technology. |
For a retention obligation that extends 15 years or more, this is a structural compliance risk, not a hypothetical one. Technology platforms do not stand still across 15-year horizons. Neither do the organizations that develop them.
The Practical Consequence
Teams that rely on Watson™’s built-in Study Archive are, in effect, making the compliance of their long-term retention contingent on three things remaining stable: the Watson™ application itself, the Microsoft format dependency, and the completeness of the export in whichever Watson™ version they happen to be running. None of these is a regulatory requirement, and none is within the regulated party’s control once the vendor moves on.
The Silent Risk: Completed Studies Still Living in Production Systems
The most common long-term retention posture in bioanalytical labs is not a posture at all — it is the absence of one. Completed studies remain in the Watson™ production environment because no decision has been made to move them, and because the costs and risks of leaving them there have not been systematically assessed. Those costs accumulate quietly.
Infrastructure Dependency
Watson LIMS™ requires Oracle Database™ (Standard or Enterprise Edition), and Oracle™’s standard support terms escalate annual fees by approximately 8% per year. Windows Server 2016 — a common underlying platform in Watson™ environments — reaches the end of extended support in January 2027; Extended Security Updates are available beyond that date, but their cost doubles annually. Validated GxP hardware ages out on a five- to seven-year refresh cycle.
Each of these creates a decision point: either invest to extend the legacy stack’s life or address the underlying retention question directly. Labs that have deferred the archiving decision face all three pressure points simultaneously.
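To make the compounding concrete, here is a minimal sketch. Only the 8% escalation rate comes from the discussion above; the starting fee and the ten-year window are hypothetical placeholders:

```python
# Sketch: compound escalation of an annual support fee at ~8% per year.
# The base fee is a hypothetical placeholder, not a quoted Oracle price.

def projected_fee(base_fee: float, rate: float, years: int) -> float:
    """Annual fee after `years` of compounding at `rate`."""
    return base_fee * (1 + rate) ** years

def cumulative_cost(base_fee: float, rate: float, years: int) -> float:
    """Total support spend over `years`, one payment per year."""
    return sum(projected_fee(base_fee, rate, t) for t in range(years))

base = 50_000.0  # hypothetical annual support fee today
print(f"Fee in year 10:  {projected_fee(base, 0.08, 10):,.0f}")  # roughly 2.16x base
print(f"Spend, 10 years: {cumulative_cost(base, 0.08, 10):,.0f}")
```

At 8% the annual fee more than doubles within a decade, which is why a deferred archiving decision gets more expensive every year it stays deferred.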
Personnel and Institutional Knowledge
The individuals who understand how a specific Watson™ environment is configured, which studies are in which status, and what the system’s validation history looks like are typically a small group. As those individuals leave, the operational overhead of maintaining the system — and of responding accurately to regulatory queries about specific studies — increases. An archive structured for readability without Watson™ expertise transfers that knowledge into the record itself. It does not depend on organizational continuity.
Post-Merger Complexity
Pharmaceutical M&A activity in 2024 and 2025 exceeded $70 billion in major announced deals. Each transaction creates a 24-month integration window in which application rationalization is an explicit management objective. Watson™ is a high-frequency duplicate: both parties to many deals run Watson™ environments, and the GxP retention obligation prevents decommissioning either instance without a compliant archive in place. Labs that have already archived their completed studies enter an integration unconstrained by this.
The Inspection Scenario
Consider a regulatory authority raising a data query in 2027 against a bioanalytical study completed in 2013. The query requires producing accurate, complete, human-readable copies of the study data — results, regression, sample records, and audit trail — promptly. If those records are held in a Watson™ instance that has not been fully patched since 2025, that runs on Windows Server approaching end-of-support, and whose validation status has not been formally maintained, the response to that query carries a set of risks that would not exist if the study had been archived in open formats at completion.
What a Regulator-Ready Archive Looks Like: Format and Content
A Watson™ study archive that robustly satisfies §11.10(b), §11.10(e), and Annex 11 §7 needs to meet three criteria simultaneously: completeness (all regulatory-relevant data is present), format independence (no proprietary viewer required to read it), and verifiable integrity (any post-archive alteration is detectable). The following describes what that archive contains and why each element matters.
PDF/A — Human-Readable, ISO Long-Term Preservation Standard
PDF/A (ISO 19005) is the international standard for electronic document archiving. Unlike standard PDF, PDF/A prohibits external dependencies: fonts are embedded, color profiles are self-contained, and the file carries no runtime requirements. A PDF/A document opened in 2040 renders identically to one opened today, without requiring the application that created it.
Within the archive, PDF/A documents are structured in two orientations: by run (for analytical inspection, matching the way a reviewer examines a specific analytical event) and by study (for submission review, presenting the complete study record as a regulatory reviewer expects to encounter it). This structure mirrors how regulators actually read study data and shortens the time a lab team not involved in the original study needs to respond to an inspection query.
CSV — Structured, Machine-Readable Data
Every dataset in the Watson™ study record is exported to CSV: analytical results, calibration curves, regression parameters (including weighting factor and acceptance criteria), QC datasets, sample result tables, and chromatographic peak summary data. Failed runs are included. Rejected samples are included. Reassay events are included. Deactivation comments are included.
The governing principle is fidelity: the CSV export represents the study as Watson™ holds it at final status. Nothing is filtered, cleaned, or summarised. This is the dataset that answers a sponsor query, supports a regulatory response, or provides input for an aggregate data analysis.
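As an illustration of what “answers a sponsor query” means in practice, the following sketch pulls rejected and reassayed samples straight from an archived CSV. The column names and values are hypothetical placeholders, not the actual Watson™ export schema:

```python
import csv
import io

# Hypothetical excerpt of an archived sample-results CSV. Column names
# (SAMPLE_ID, STATUS, COMMENT) are illustrative only.
sample_results_csv = """\
SAMPLE_ID,RUN_ID,STATUS,COMMENT
S-001,R-01,Accepted,
S-002,R-01,Rejected,Hemolysis noted
S-002,R-03,Accepted,Reassay of S-002
S-003,R-02,Rejected,Above ULOQ
"""

rows = list(csv.DictReader(io.StringIO(sample_results_csv)))

# "Which samples were excluded, and on what basis?"
rejected = [(r["SAMPLE_ID"], r["COMMENT"]) for r in rows if r["STATUS"] == "Rejected"]

# "Which samples were reassayed?"
reassayed = {r["SAMPLE_ID"] for r in rows if "Reassay" in r["COMMENT"]}

print(rejected)   # [('S-002', 'Hemolysis noted'), ('S-003', 'Above ULOQ')]
print(reassayed)  # {'S-002'}
```

Because the export is unfiltered, these questions can be answered without Watson™, Oracle™, or any proprietary viewer.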
XML — Structured Re-Import and Long-Term Accessibility
XML exports provide the same datasets in a structured, self-describing format suited to programmatic access. XML is the appropriate format for re-importing into a downstream system — a new LIMS, a sponsor’s eTMF, or a data warehouse — and for scenarios in which a regulatory query requires analysis across multiple studies. As a plain-text, open format, XML carries no proprietary dependency and requires no specialized software to read.
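A minimal sketch of that programmatic access, using a hypothetical element layout rather than the actual export schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of an archived study XML. Element and attribute
# names are illustrative placeholders.
study_xml = """<study id="BA-2013-017">
  <result sample="S-001" run="R-01" conc="12.4" units="ng/mL"/>
  <result sample="S-002" run="R-01" conc="48.1" units="ng/mL"/>
</study>"""

root = ET.fromstring(study_xml)

# Build a sample-to-concentration map; the same loop scales to a
# cross-study query by iterating over many archived files.
concs = {r.get("sample"): float(r.get("conc")) for r in root.iter("result")}
print(concs)  # {'S-001': 12.4, 'S-002': 48.1}
```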
Audit Trail — Complete Export Under §11.10(e)
The Watson™ audit trail — both study-level and system-level — is exported in full as HTML/XML. Every user action recorded by Watson™ against the study is present: result acceptance and rejection, run approval, study reopen events, approval withdrawals, with user identity, timestamp, and action type. No entry is summarised, aggregated, or filtered.
This is the element that directly satisfies §11.10(e). The completeness criterion is absolute: if an entry existed in Watson™ at the time of the archive run, it is in the archive. What Watson™ held is what the archive contains.
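The completeness criterion can itself be checked mechanically. The sketch below walks a hypothetical XML audit-trail export and confirms that every entry carries the three fields §11.10(e) names: user identity, timestamp, and action type. The element and attribute names are placeholders, not the actual export format:

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of an exported audit trail.
audit_xml = """<audit_trail study="BA-2013-017">
  <entry user="jdoe" timestamp="2013-04-02T14:11:09Z" action="RESULT_REJECTED"/>
  <entry user="asmith" timestamp="2013-04-03T09:02:41Z" action="RUN_APPROVED"/>
</audit_trail>"""

# The three attributes every entry must carry under section 11.10(e).
REQUIRED = ("user", "timestamp", "action")

entries = ET.fromstring(audit_xml).findall("entry")
missing = [e.attrib for e in entries if any(not e.get(f) for f in REQUIRED)]

print(f"{len(entries)} entries checked, {len(missing)} incomplete")
```

A check of this kind verifies structural completeness of the export; it does not, of course, substitute for exporting every entry Watson™ held in the first place.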
An Honest Note on Scope
Two categories of data fall outside the scope of a Watson™ study archive and should be addressed explicitly before any decommissioning project begins.
First, raw chromatography data from instrument data acquisition systems — Empower, MassHunter, Analyst — reside in those systems, not in Watson™. Watson™ holds integrated results: peak areas, retention times, and calculated concentrations. The underlying raw files have independent regulatory retention obligations and must be addressed separately through those systems.
Second, system-level data held in Watson outside individual studies — including assay batch records for reference standards and QC samples (Standard, QC, and Stability batch types), which are maintained at the system level rather than within a specific study — is not part of the study archive. Identifying this data and ensuring its separate retention are standard items in any pre-decommissioning project plan.
These are not limitations of the archiving approach — they are accurate descriptions of the Watson™ data model. An archiving process that is honest about scope is more defensible at inspection than one that implies completeness it does not deliver.
Archive Format Comparison
| Watson Study Archive™ (built-in) | Regulator-Ready Archive (PDF/A + CSV + XML) |
| --- | --- |
| Coverage depends on Watson™ version | Complete, version-independent export |
| Sample Information: absent in several Watson™ versions | Sample Information: always included |
| Proprietary Microsoft-based format | ISO 19005 (PDF/A) + open CSV/XML |
| Readability depends on a proprietary software stack | Readable on any computer, permanently |
| No cryptographic integrity verification | Per-file SHA hash + hash-of-hashes manifest |
| Audit trail completeness: version-dependent | Audit trail: complete, HTML/XML export |
Cryptographic Data Integrity: Answering the Inspector’s Hardest Question
A complete, well-formatted archive addresses most regulatory questions. There is one question it does not answer on its own: how do you demonstrate that the archive today is identical to the archive as it was created — that no file has been altered, added, or removed in the intervening years?
This is the chain-of-custody question for electronic records, and it is increasingly the focus of data integrity guidance from both the FDA and the EMA. The answer lies in cryptographic hashing.
How It Works
- Each file in the archive receives a cryptographic SHA hash at the time of creation. The hash is a fixed-length fingerprint derived entirely from the file’s content. Any subsequent alteration to a single byte produces a completely different hash.
- A manifest file lists the hash for every file in the archive. The manifest documents the complete expected state of the archive at the moment of creation.
- The manifest itself is hashed — producing a “hash of hashes” — and that value is recorded both within the archive and in the processing protocol logged by StudyReporter at the time of the archive run. It exists in two independent locations.
To verify archive integrity at any subsequent point, the hashes of all files are recalculated and compared against the manifest. If every hash matches, the archive is provably unaltered. If any hash differs, the specific file that has changed is immediately identifiable. No assertion of trust is required — the verification is mathematical and reproducible by any party with access to the archive.
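A minimal sketch of the scheme described above, with illustrative file names and manifest layout rather than the actual StudyReporter output:

```python
import hashlib

def sha256_bytes(data: bytes) -> str:
    """Fixed-length fingerprint derived entirely from the content."""
    return hashlib.sha256(data).hexdigest()

def build_manifest(files: dict[str, bytes]) -> tuple[str, str]:
    """Return (manifest_text, hash_of_hashes) for a set of archive files."""
    lines = [f"{sha256_bytes(content)}  {name}" for name, content in sorted(files.items())]
    manifest = "\n".join(lines) + "\n"
    return manifest, sha256_bytes(manifest.encode())

def verify(files: dict[str, bytes], manifest: str) -> list[str]:
    """Names of files whose current hash no longer matches the manifest.
    A file absent from the manifest (an addition) is also flagged."""
    expected = {}
    for line in manifest.splitlines():
        digest, name = line.split("  ", 1)
        expected[name] = digest
    return [name for name, content in files.items()
            if expected.get(name) != sha256_bytes(content)]

# Illustrative archive content.
archive = {
    "BA-2013-017/results.csv": b"SAMPLE_ID,CONC\nS-001,12.4\n",
    "BA-2013-017/audit_trail.xml": b"<audit_trail/>",
}
manifest, hash_of_hashes = build_manifest(archive)
print(verify(archive, manifest))   # [] -> provably unaltered

archive["BA-2013-017/results.csv"] += b"S-999,0.0\n"  # simulate tampering
print(verify(archive, manifest))   # the altered file is named precisely
```

The verification is reproducible by any party with the files and the manifest; no trust in the archiving system is required.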
What This Means at Inspection
For an inspector asking, “How do you know this archive has not been modified since it was created?”, the answer is a documented, auditable verification: recalculate the hashes, compare them against the manifest, and confirm they match. The processing protocol logged during the archive run establishes the chain of custody.
| A static archive with cryptographic integrity verification does not require periodic revalidation. There is no configuration, no update cycle, and no process that could alter the archive between reads. The hash manifest verifies integrity at creation; that verification can be re-run at any time without a revalidation cycle. |
One operational note: if the archive is migrated to a new storage infrastructure, the hash manifest should be re-verified against the original before the source is retired. This is a single verification operation, not a revalidation event.
In Practice: Boehringer Ingelheim
Boehringer Ingelheim is an active reference customer for the StudyPlus Archive Module. Their approach illustrates a compliance posture that is increasingly common among large pharmaceutical organizations that have thought systematically about the long-term retention question.
Rather than waiting for a decommissioning decision or a regulatory pressure event, Boehringer archives completed Watson™ studies upon completion. The archive exists independently of the Watson™ environment from the moment it is created. It satisfies the retention requirement regardless of what subsequently happens to Watson™, Oracle™, or the underlying Windows infrastructure.
This is structurally different from keeping Watson™ running and treating continued system availability as the mechanism for satisfying the retention obligation. The latter creates a dependency on a live system; the former eliminates it. The retention obligation is the same in either scenario — the risk profile is not.
For organizations working through this question — still holding completed studies in Watson™ and evaluating their options — Boehringer Ingelheim is available as a peer reference through the up to data reference program. Conversations of this kind are typically most useful once the technical and regulatory questions have been framed and the organization is working through implementation specifics.
Validation Considerations
For QA teams evaluating any new process or system, the validation question arises early. In the case of a static-output archiving module, the answer is more straightforward than most.
Vendor Qualification Under GAMP 5
The StudyPlus Archive Module is vendor-qualified to support a GAMP 5 validation process, and up to data GmbH is ISO 9001:2015 certified. The module has been tested and validated against Watson™ 7.3, 7.4, and 7.6.
Under GAMP 5 qualification, the customer’s internal validation obligation can be reduced to a business-cycle test: confirming that a representative set of known studies can be located in the archive, that their content is complete and readable, and that hash verification passes. This is substantially less than a full IQ/OQ/PQ cycle, and the supplier’s qualification documentation is available as part of the engagement.
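A business-cycle test of this kind can be largely mechanical. The sketch below checks that each sampled study carries the expected artifact types; the directory layout and file names are hypothetical placeholders, not a prescribed archive structure:

```python
# Artifact types a complete study archive is expected to contain
# (illustrative names; an actual archive defines its own layout).
EXPECTED_ARTIFACTS = ("report.pdf", "results.csv", "results.xml", "audit_trail.xml")

def missing_artifacts(archive_listing: set[str], study_id: str) -> list[str]:
    """Artifacts the archive should contain for `study_id` but does not."""
    return [a for a in EXPECTED_ARTIFACTS
            if f"{study_id}/{a}" not in archive_listing]

# Hypothetical file listing of the archive under test.
listing = {
    "BA-2013-017/report.pdf", "BA-2013-017/results.csv",
    "BA-2013-017/results.xml", "BA-2013-017/audit_trail.xml",
    "BA-2014-003/report.pdf", "BA-2014-003/results.csv",
}

print(missing_artifacts(listing, "BA-2013-017"))  # [] -> complete
print(missing_artifacts(listing, "BA-2014-003"))  # names the gaps
```

Combined with hash verification and a spot-check that the PDF/A documents open and read correctly, this is the substance of the business-cycle test.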
No Periodic Revalidation for a Static Archive
A validated running system requires periodic revalidation because its configuration can change, its software is updated, and its processes can drift. A hash-verified static archive has none of these characteristics. There is no configuration to maintain, no update cycle, and no process that could alter the archive content between reads.
The hash manifest verifies integrity at the point of creation; that verification can be re-run at any time as part of an inspection preparation, a storage migration, or a scheduled data integrity review. It does not constitute revalidation — it is a verification operation.
This is one of the less obvious advantages of a static-file archive over a retained live system: the validation burden does not compound over time. A running Watson™ environment requires ongoing validation and maintenance. An archive does not.
Decommissioning Documentation
Retiring a validated GxP system requires documented evidence that the process was executed correctly and that data integrity was maintained throughout. A complete package consists of a Decommissioning Specification, Execution Records, and a Decommissioning Report. The Archiving Module generates a processing protocol in StudyReporter for each archive run, which forms part of the execution record. Documentation templates structured to meet these requirements are available as part of the engagement.
Conclusion: The Archive Is a Compliance Decision, Not a Default
The implicit choice most organizations have made — keeping Watson™ running because completed studies are still held within it — is a compliance posture. It is simply one that was never made consciously, and whose costs and risks tend not to surface until they become acute.
21 CFR Part 11 and EU GMP Annex 11 do not require that completed studies be held in the system that generated them. They require that records remain accurate, complete, and human-readable across the defined retention period, and that their integrity can be demonstrated. Those requirements are satisfied by a well-constructed archive of ISO-standardized open formats with cryptographic integrity verification. They are not uniquely satisfied by keeping Watson™ alive.
Watson™’s own Study Archive function is a useful operational tool. It is not a substitute for a format-independent, integrity-verified archive, because its coverage is version-dependent, its output format is proprietary, and its long-term readability is contingent on technology continuity that no vendor can guarantee across a 15-year horizon.
The practical question is not whether to archive — it is when. Organizations that archive at study completion carry the lowest risk at the lowest cost. The archive exists from day one. The retention obligation is met from day one. Whatever decisions follow — about Watson™, Oracle™, Windows, or the broader infrastructure — are unconstrained by compliance risk.
| The decision to archive is a compliance decision. The format in which data is archived determines whether that compliance is robust or contingent on the continued availability of systems, vendors, and software versions that will not outlast the retention obligation. |
References
European Commission (2011). EU guidelines for good manufacturing practice for medicinal products for human and veterinary use: Annex 11, computerised systems. URL: https://health.ec.europa.eu/system/files/2016-11/annex11_01-2011_en_0.pdf.
European Medicines Agency & Pharmaceutical Inspection Co-operation Scheme (2025). Joint stakeholders consultation on the revision of Chapter 4 on Documentation, Annex 11 on Computerised Systems and on the new Annex 22 on Artificial Intelligence of the PIC/S and EU GMP Guides. PIC/S. URL: https://picscheme.org/en/news/joint-stakeholders-consultation-on-the-revision-of-chapter-4.
International Organization for Standardization (2005). ISO 19005-1:2005: Document management — Electronic document file format for long-term preservation — Part 1: Use of PDF (PDF/A). URL: https://www.iso.org/standard/38920.html.
International Society for Pharmaceutical Engineering (2022). GAMP 5: A risk-based approach to compliant GxP computerised systems (2nd ed.). URL: https://ispe.org/publications/guidance-documents/gamp-5-guide-2nd-edition.
U.S. Food and Drug Administration (2026a). 21 CFR Part 58.195: Good laboratory practice for nonclinical laboratory studies — Record retention. Electronic Code of Federal Regulations. URL: https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-58/subpart-K/section-58.195.
U.S. Food and Drug Administration (2026b). 21 CFR Part 11: Electronic records; electronic signatures. Electronic Code of Federal Regulations. URL: https://www.ecfr.gov/current/title-21/chapter-I/subchapter-A/part-11.
up to data has been supporting pharmaceutical and life sciences companies with automated laboratory processes for regulatory study data management for over 20 years. Our solutions eliminate data silos, implement secure automated data transfer processes, and reduce manual activities while ensuring full regulatory compliance.