OAI-PMH
The Archives of Surgery and Clinical Research (ASCR) exposes article and issue metadata via the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). This page explains the endpoint structure, supported formats, and best practices so libraries, discovery services, and repositories can integrate our records reliably.
Overview & Intended Use
OAI-PMH provides a uniform, standards-based way to harvest ASCR metadata for indexing, link-resolving, and preservation workflows. Harvesters submit HTTP requests with a small set of verbs (Identify, ListSets, ListMetadataFormats, ListIdentifiers, ListRecords, GetRecord); responses are well-formed XML. We support broadly interoperable formats so that discovery services, library catalogs, and research repositories can keep their records current without scraping HTML or PDFs.
Endpoint & Access
Our OAI-PMH base URL (baseURL) is available upon request to ensure stable integrations and to prevent automated probing. If you are configuring a harvester, contact the editorial office (see Contact) and share your IP/domain, preferred schedule, and metadata format. We will confirm the baseURL and any set specifications that fit your use case.
Typical endpoint patterns
- OJS-style: https://www.clinsurgeryjournal.com/index.php/ascr/oai
- Alt path: https://www.clinsurgeryjournal.com/oai
Your integration email will specify the exact baseURL to use.
Supported Verbs
Verb | Purpose | Key parameters |
---|---|---|
Identify | Repository identity and capabilities | None |
ListMetadataFormats | Available formats (e.g., oai_dc, jats where enabled) | identifier (optional) |
ListSets | Partitioning (e.g., journal sections, issues) | None |
ListIdentifiers | Headers only (for lightweight crawling) | from, until, metadataPrefix, set, resumptionToken |
ListRecords | Full records for harvesting | Same as above |
GetRecord | Single record by identifier | identifier, metadataPrefix |
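The table above can be turned into a small client-side check that runs before a request is sent. The following is a minimal sketch, assuming the argument rules of the OAI-PMH 2.0 specification; `ALLOWED_ARGS` and `check_args` are illustrative names and not part of any ASCR tooling.

```python
# Allowed arguments per verb, mirroring the table above.
ALLOWED_ARGS = {
    "Identify": set(),
    "ListMetadataFormats": {"identifier"},
    "ListSets": set(),
    "ListIdentifiers": {"from", "until", "metadataPrefix", "set", "resumptionToken"},
    "ListRecords": {"from", "until", "metadataPrefix", "set", "resumptionToken"},
    "GetRecord": {"identifier", "metadataPrefix"},
}


def check_args(verb, args):
    """Return a list of problems with a request; an empty list means it looks valid."""
    if verb not in ALLOWED_ARGS:
        return [f"unknown verb: {verb}"]
    extra = sorted(set(args) - ALLOWED_ARGS[verb])
    problems = [f"unexpected argument: {name}" for name in extra]
    # A resumptionToken is exclusive: it may not be combined with other arguments.
    if "resumptionToken" in args and len(args) > 1:
        problems.append("resumptionToken must be the only argument")
    if verb == "GetRecord" and not {"identifier", "metadataPrefix"} <= set(args):
        problems.append("GetRecord requires identifier and metadataPrefix")
    return problems
```

Running a check like this locally catches most causes of a badArgument response before any request reaches the server.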
Metadata Formats
To balance broad interoperability with rich, domain-specific description, we support at least one baseline format and, where applicable, JATS XML for article-level metadata.
- Dublin Core (oai_dc): minimal, widely supported fields (title, creators, description/abstract, publisher, date, type, format, identifier, source, language, relation, coverage, rights).
- JATS (jats): richer article structure (contributors with roles, affiliations, abstracts by section, funding data, references, license information, clinical trial IDs). Availability depends on platform configuration.
Field mapping (indicative)
Article element | Dublin Core | JATS (example) |
---|---|---|
Title | dc:title | <article-title> |
Authors | dc:creator | <contrib contrib-type="author"> |
Abstract | dc:description | <abstract> |
DOI | dc:identifier | <article-id pub-id-type="doi"> |
License | dc:rights | <license xlink:href="https://creativecommons.org/licenses/by/4.0/"> |
Publication date | dc:date | <pub-date> |
References | (often omitted) | <ref-list> with <mixed-citation> or <element-citation> |
The exact field mapping reflects the underlying platform’s export configuration.
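To illustrate the Dublin Core column, here is a minimal sketch that reads an oai_dc payload into a Python dict using the standard library. The sample record, including its DOI, is invented for illustration; `dc_fields` is a hypothetical helper name, and real ASCR records may carry different or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the simple Dublin Core elements used inside oai_dc records.
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample payload; values are placeholders, not a real ASCR record.
SAMPLE_DC = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example article title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
  <dc:creator>Roe, Richard</dc:creator>
  <dc:identifier>https://doi.org/10.1234/example</dc:identifier>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/</dc:rights>
  <dc:date>2025-03-01</dc:date>
</oai_dc:dc>"""


def dc_fields(dc_xml):
    """Collect Dublin Core elements into lists keyed by local name.

    Every element is repeatable in oai_dc, so each value is a list and
    document order (for example, author order) is preserved.
    """
    fields = {}
    for element in ET.fromstring(dc_xml):
        if element.tag.startswith(DC) and element.text:
            name = element.tag[len(DC):]
            fields.setdefault(name, []).append(element.text.strip())
    return fields
```

Keeping every field as a list avoids silently dropping second and later authors or identifiers.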
Sets & Scoping
OAI-PMH sets allow harvesters to request subsets of the repository (e.g., by journal section or issue). Common patterns include a set for each Section (Original Research, Reviews, Case Reports) and for each Issue/Volume where the platform supports it.
- Use ListSets first to enumerate available setSpecs.
- Scope by date using from and until with day-level granularity (UTC).
- Combine set scoping with metadataPrefix to minimize payloads.
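As a sketch of the first step, the snippet below extracts setSpec and setName pairs from a ListSets response. The setSpec values in the sample are invented; the real values come from the repository's own ListSets output. `list_set_specs` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Invented sample; real setSpec values will differ.
SAMPLE_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>ascr:ART</setSpec><setName>Original Research</setName></set>
    <set><setSpec>ascr:REV</setSpec><setName>Reviews</setName></set>
  </ListSets>
</OAI-PMH>"""


def list_set_specs(response_xml):
    """Map each setSpec in a ListSets response to its human-readable setName."""
    root = ET.fromstring(response_xml)
    return {
        s.findtext(f"{OAI}setSpec"): s.findtext(f"{OAI}setName")
        for s in root.iter(f"{OAI}set")
    }
```

A chosen setSpec is then passed as the set argument of ListIdentifiers or ListRecords together with metadataPrefix and, if needed, from and until.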
Sample Requests
Identify the repository
GET {baseURL}?verb=Identify
Confirms repository name, earliest datestamp, granularity, and admin email.
List metadata formats
GET {baseURL}?verb=ListMetadataFormats
Shows available metadataPrefix values such as oai_dc and jats (if enabled).
Incremental harvest (Dublin Core)
GET {baseURL}?verb=ListRecords&metadataPrefix=oai_dc&from=2025-01-01&until=2025-12-31
Retrieves records modified in the date window. Follow any resumptionToken to continue.
Single record by identifier
GET {baseURL}?verb=GetRecord&identifier=oai:ascr:article-1064&metadataPrefix=jats
Returns one record with enriched fields (where JATS is available).
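The sample requests can be assembled programmatically so that argument values are URL-encoded correctly. This is a minimal sketch using the Python standard library; `oai_url` is a hypothetical helper, and `BASE_URL` is a placeholder that must be replaced with the baseURL confirmed in your integration email.

```python
from urllib.parse import urlencode

# Placeholder only; use the baseURL confirmed by the editorial office.
BASE_URL = "https://example.org/oai"


def oai_url(base_url, verb, **args):
    """Build an OAI-PMH GET URL.

    `from` is a reserved word in Python, so callers pass `from_` instead.
    """
    query = {"verb": verb}
    for name, value in args.items():
        query["from" if name == "from_" else name] = value
    return f"{base_url}?{urlencode(query)}"


identify = oai_url(BASE_URL, "Identify")
incremental = oai_url(
    BASE_URL, "ListRecords",
    metadataPrefix="oai_dc", from_="2025-01-01", until="2025-12-31",
)
single = oai_url(
    BASE_URL, "GetRecord",
    identifier="oai:ascr:article-1064", metadataPrefix="jats",
)
```

Note that `urlencode` percent-encodes the colons in the identifier; the server decodes them back to the same value.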
Resumption Tokens & Paging
Large result sets are split across multiple responses. The repository returns a resumptionToken with an optional size hint and expiration. Your harvester should:
- Cache the last successful token and resume from there after any connectivity issues.
- Avoid mixing parameters with a token (per the spec, pass only verb and resumptionToken).
- Throttle follow-up requests (see Throttling).
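The paging rules above can be sketched as a generator that keeps requesting pages until the token runs out. This is an illustration under stated assumptions: `fetch` is a caller-supplied function (not shown) that takes a dict of query arguments and returns the response body as a string, and persisting the last token for recovery is left to the caller.

```python
import time
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, first_args, delay=1.0):
    """Yield <record> elements from every page of a ListRecords harvest.

    `first_args` must include "verb"; `delay` is the pause in seconds
    between paged requests.
    """
    args = dict(first_args)
    while True:
        root = ET.fromstring(fetch(args))
        yield from root.iter(f"{OAI}record")
        token = root.find(f".//{OAI}resumptionToken")
        # An absent or empty token marks the final page.
        if token is None or not (token.text or "").strip():
            return
        # Follow-up requests carry only the verb and the token.
        args = {"verb": first_args["verb"], "resumptionToken": token.text.strip()}
        time.sleep(delay)
```

Passing the fetch function in keeps the paging logic independent of the HTTP client and makes it easy to test against canned responses.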
Dates, Granularity & Time Zones
ASCR supports day-level granularity (YYYY-MM-DD). Timestamps in responses are UTC. If your pipeline runs more than once per day, include a one-day overlap in the from/until window to avoid missing updates made near midnight boundaries. If your system stores local times, normalize to UTC before constructing requests.
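A minimal sketch of the window calculation, assuming the harvester stores the time of its last successful run as a timezone-aware datetime; `harvest_window` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone


def harvest_window(last_run, now=None, overlap_days=1):
    """Return (from, until) as UTC YYYY-MM-DD strings with a safety overlap.

    `last_run` (and `now`, if given) must be timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    # Convert to UTC first, then truncate to the day and step back by the overlap.
    start = last_run.astimezone(timezone.utc).date() - timedelta(days=overlap_days)
    end = now.astimezone(timezone.utc).date()
    return start.isoformat(), end.isoformat()
```

For example, a last run stored as 22:30 on 10 March at UTC-5 falls on 11 March in UTC, so with the default overlap the window starts on 2025-03-10. Records returned twice because of the overlap should be deduplicated by OAI identifier and datestamp.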
Deleted Records & Status Flags
OAI-PMH repositories signal record status in the header. While ASCR rarely removes records, updates can include corrections, retractions, or expressions of concern. We recommend:
- Honor the status="deleted" flag if present (keep a tombstone or remove, per your policy).
- Parse relation links in JATS (or provided fields) to interlink original articles with notices.
- Prefer the DOI landing page for citation linking; use OAI identifiers for harvesting only.
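To show where the deleted flag lives, the sketch below separates live and deleted identifiers by reading the status attribute on each header. The sample response is invented (the second identifier is hypothetical), and `split_by_status` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Invented sample response for illustration.
SAMPLE_HEADERS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:ascr:article-1064</identifier>
      <datestamp>2025-02-01</datestamp>
    </header>
    <header status="deleted">
      <identifier>oai:ascr:article-1065</identifier>
      <datestamp>2025-02-03</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def split_by_status(response_xml):
    """Return (live_ids, deleted_ids) from a ListIdentifiers or ListRecords response."""
    live, deleted = [], []
    for header in ET.fromstring(response_xml).iter(f"{OAI}header"):
        identifier = header.findtext(f"{OAI}identifier")
        target = deleted if header.get("status") == "deleted" else live
        target.append(identifier)
    return live, deleted
```

What happens to the deleted identifiers (tombstone or removal) remains a local policy decision.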
Persistent Links & Licensing
Each article includes a DOI in canonical form (https://doi.org/…) and a clear license statement (typically CC BY 4.0). Preserve these fields during ingestion so your users see correct reuse permissions and stable links. If your discovery layer re-renders abstracts, retain the attribution and the license line.
Best Practices for Harvesters
Scheduling & Throttling
- Full harvest on first run; incremental runs daily or weekly depending on your freshness needs.
- Delay 1–2 seconds between paged requests; back off exponentially on HTTP 429/503.
- Use ListIdentifiers to detect changes quickly; fetch full records with GetRecord as needed.
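The back-off rule can be sketched as a small delay calculator. This is an illustration only: the base delay and cap are assumed values rather than ASCR policy, and `backoff_delay` is a hypothetical helper name.

```python
def backoff_delay(attempt, retry_after=None, base=2.0, cap=300.0):
    """Seconds to wait before retry number `attempt` (0-based) after HTTP 429/503.

    A numeric Retry-After header value takes precedence over the
    exponential schedule.
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            # Retry-After may also be an HTTP date; this sketch does not
            # parse dates and falls back to exponential back-off instead.
            pass
    return min(cap, base * (2 ** attempt))
```

With the defaults, successive retries wait 2, 4, 8, 16 seconds and so on, up to the cap of 300 seconds.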
Validation & QA
- Validate XML against OAI-PMH and (for JATS) NISO schemas.
- Check that each record carries a DOI, license URL, and at least one creator.
- Prefer controlled vocabularies (e.g., MeSH) when present; do not strip identifiers such as ORCID or ROR.
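The per-record checks can be sketched as follows, assuming each record has already been parsed into a dict of string lists keyed by Dublin Core element name. That record shape and the `qa_problems` name are assumptions for illustration, and the DOI test is deliberately loose.

```python
def qa_problems(record):
    """Return a list of QA problems for one record; an empty list means it passed.

    `record` is assumed to look like
    {"identifier": [...], "rights": [...], "creator": [...]}.
    """
    problems = []
    identifiers = [i.strip().lower() for i in record.get("identifier", [])]
    # Loose check: accept DOI URLs, "doi:" prefixes, and bare "10." DOIs.
    has_doi = any(
        "doi.org/10." in i or i.startswith(("doi:", "10.")) for i in identifiers
    )
    if not has_doi:
        problems.append("missing DOI")
    rights = [r.strip().lower() for r in record.get("rights", [])]
    if not any(r.startswith(("http://", "https://")) for r in rights):
        problems.append("missing license URL")
    if not record.get("creator"):
        problems.append("missing creator")
    return problems
```

Records that fail can be logged and reported to the contact addresses on this page rather than silently dropped.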
Normalization Tips
- Map dc:identifier values to distinct types (DOI vs URL vs OAI) to avoid duplicate linking.
- Preserve author order; maintain affiliations and country fields where present.
- Translate license URIs to human-readable badges in UI, but keep the machine-readable link.
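As a sketch of the first tip, dc:identifier values can be sorted into types with a few pattern checks. The patterns below are a pragmatic approximation of common DOI spellings rather than a complete validator, and `classify_identifier` is a hypothetical helper name.

```python
import re

# Matches bare DOIs and the common "doi:" and doi.org URL spellings.
_DOI = re.compile(r"^(https?://(dx\.)?doi\.org/|doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
_URL = re.compile(r"^https?://", re.IGNORECASE)


def classify_identifier(value):
    """Label a dc:identifier value as 'doi', 'oai', 'url', or 'other'."""
    v = value.strip()
    # Test for DOI first, because DOI links are also URLs.
    if _DOI.match(v):
        return "doi"
    if v.lower().startswith("oai:"):
        return "oai"
    if _URL.match(v):
        return "url"
    return "other"
```

Checking for a DOI before the generic URL test is what prevents the same article from being linked twice, once as a DOI and once as a plain URL.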
Troubleshooting & Common Errors
Symptom | Likely cause | Resolution |
---|---|---|
HTTP 400 with badArgument | Malformed parameters | Check spelling, omit extra params when using resumptionToken. |
Empty ListRecords response | Date window too narrow; wrong metadataPrefix | Broaden from/until; use ListMetadataFormats. |
cannotDisseminateFormat | Unsupported metadataPrefix | Fall back to oai_dc or enable JATS in coordination with us. |
HTTP 503 Retry-After | Server throttle | Honor Retry-After header; slow subsequent requests. |
Missing references in DC | DC is minimal | Use JATS format where available for reference lists and funding data. |
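Protocol-level problems such as badArgument and cannotDisseminateFormat are reported inside the XML response as error elements with a code attribute. The sketch below pulls them out so a harvester can log or branch on them; the sample response is invented and `oai_errors` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI = "{http://www.openarchives.org/OAI/2.0/}"

# Invented sample error response for illustration.
SAMPLE_ERROR = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <request verb="GetRecord">https://example.org/oai</request>
  <error code="cannotDisseminateFormat">Format not supported for this item</error>
</OAI-PMH>"""


def oai_errors(response_xml):
    """Return (code, message) pairs for every <error> element in a response."""
    root = ET.fromstring(response_xml)
    return [
        (e.get("code"), (e.text or "").strip())
        for e in root.iter(f"{OAI}error")
    ]
```

An empty list means the response carried no protocol error. A noRecordsMatch code on ListRecords or ListIdentifiers usually just means nothing changed in the requested window.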
Security & Responsible Use
Respect the server and our users by following polite harvesting practices. Do not attempt to scrape PDFs through the OAI-PMH interface, and do not store email addresses or other non-public personal data beyond what is included in public metadata. If you operate a shared harvester, include a recognizable User-Agent string and a contact email so we can reach you about operational issues.
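One way to identify a harvester is sketched below with the Python standard library. The agent name, contact address, and URL are placeholders, `polite_request` is a hypothetical helper, and the exact User-Agent layout is a convention rather than a requirement.

```python
from urllib.request import Request


def polite_request(url, agent_name, contact_email):
    """Build a GET request that names the harvester and a contact address."""
    headers = {
        # Product token first, contact details in a trailing comment.
        "User-Agent": f"{agent_name} (mailto:{contact_email})",
        # The standard HTTP From header carries the operator's email address.
        "From": contact_email,
    }
    return Request(url, headers=headers)


# Placeholder values; substitute your own harvester name and contact address.
request = polite_request(
    "https://example.org/oai?verb=Identify",
    "ExampleHarvester/1.0",
    "ops@example.org",
)
```

The resulting object can be passed to `urllib.request.urlopen` once the real baseURL has been confirmed.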
Frequently Asked Questions
Can I rely on OAI-PMH for full-text?
No. OAI-PMH transmits metadata. Follow the DOI or article URL for HTML/PDF. For text-and-data mining, see our Open Access Statement and repository guidance.
What should I index as the canonical link?
Use the DOI (if available) as the canonical link and render the journal landing page as the source of record. Keep the OAI identifier strictly for harvesting.
How often should I harvest?
Weekly works for most catalogs. If you display “early view” content or metrics dashboards, consider daily, incremental runs.
Do you provide Crossref metadata directly?
We register articles with Crossref; OAI-PMH complements that by providing a consolidated feed aligned with our site records. Many indexers merge OAI-PMH with Crossref and other sources.
Contact for Integrators
To request the baseURL, enable JATS, or coordinate a test harvest, contact: production@clinsurgeryjournal.com · editorial@clinsurgeryjournal.com