Xenia Crawler Information

Page URL: https://xeniadata.com/crawler
Page version: v1 (Day 1 draft, 2026-05-13)
Audience: Webmasters, site operators, ToS reviewers, opposing counsel
Required by: Xenia Legal Framework v1.2 Section 4.4.2

What Xenia is

Xenia is a hotel data aggregator operated by Xenia Data. We build a structured, fact-only index of public hotel information to help travelers find independent and boutique properties that match their needs. We are NOT a re-publisher of third-party reviews, descriptions, or photographs.

How Xenia's automated systems operate

Xenia's automated data-collection systems are designed as a Transient Observation Architecture (TOA). The legal anchor is the line of cases running through Authors Guild v. Google, Field v. Google, hiQ Labs v. LinkedIn, and Meta v. Bright Data. The technical and operational characteristics are:

We do NOT bypass authentication. Our crawlers never attempt to log in, never use real user credentials, never accept Terms of Service through any clickwrap, and never set session cookies. If a page requires login, our crawlers stop.
We do NOT defeat anti-bot controls. We never solve CAPTCHAs, rotate IPs to evade rate limits, spoof browser fingerprints, or misrepresent the nature of our requests.
We do NOT store source HTML. Our crawlers fetch a public page, extract structured factual data into our schema, and discard the source response. We do not maintain a copy of the source material beyond the transient processing window.
We obey robots.txt. Every fetch begins with a robots.txt check. We respect Disallow and Crawl-delay rules in robots.txt, and noindex and noarchive directives in robots meta tags or X-Robots-Tag headers.
We rate-limit politely. Default: 1 request per source domain every 2 seconds, with jitter, and a maximum of 1 concurrent connection per domain.
We honor HTTP 429 and 403 responses and cease-and-desist notices immediately. Backoff is automatic; we do not retry through alternative IPs or identities.
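For site operators who want to preview how the robots.txt and rate-limit rules above would apply to their own site, the following is a minimal sketch using Python's standard urllib.robotparser. It is an illustration under stated assumptions, not Xenia's production code: the function names, the jitter bound, and the sample robots.txt are hypothetical, and the sketch assumes the crawler matches robots.txt groups on the token Xenia-Crawler.

```python
import random
from urllib import robotparser

USER_AGENT = "Xenia-Crawler/1.0 (+https://xeniadata.com/crawler)"
DEFAULT_DELAY_S = 2.0  # documented default: 1 request per domain every 2 seconds
MAX_JITTER_S = 0.5     # assumption: the actual jitter bound is not published


def load_rules(robots_txt: str) -> robotparser.RobotFileParser:
    """Parse the body of an already-fetched robots.txt file."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def may_fetch(rules: robotparser.RobotFileParser, url: str) -> bool:
    """True if the Disallow/Allow rules permit this user-agent to fetch url."""
    return rules.can_fetch(USER_AGENT, url)


def next_delay(rules: robotparser.RobotFileParser) -> float:
    """Seconds to wait before the next request to the same domain.

    Uses the larger of the default delay and any Crawl-delay, plus jitter.
    """
    crawl_delay = rules.crawl_delay(USER_AGENT)  # None when not specified
    base = max(DEFAULT_DELAY_S, float(crawl_delay or 0))
    return base + random.uniform(0, MAX_JITTER_S)


def must_back_off(status: int) -> bool:
    """HTTP 429 and 403 are treated as a signal to stop and back off."""
    return status in (403, 429)


# Hypothetical robots.txt a site operator might publish.
SAMPLE = """\
User-agent: Xenia-Crawler
Disallow: /private/
Crawl-delay: 10
"""

rules = load_rules(SAMPLE)
print(may_fetch(rules, "https://hotel.example/rooms"))          # True
print(may_fetch(rules, "https://hotel.example/private/rates"))  # False
print(next_delay(rules) >= 10.0)                                # True
```

A Crawl-delay larger than the default lengthens the wait; in this sketch a smaller one never shortens it below the 2-second default.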

Our identifying user-agent

All Xenia automated fetches identify themselves with the following user-agent string:

Xenia-Crawler/1.0 (+https://xeniadata.com/crawler)

If you see traffic from this user-agent, it is us. If you see traffic claiming to be us but from a different IP range than those listed below, please report it to abuse@xeniadata.com — that traffic is impersonation, not Xenia.

IP ranges used by Xenia

Current IP ranges from which Xenia originates automated requests:

CIDR range	Provider	Effective date
(To be populated upon Day 7 Cloudflare Workers deployment)	Cloudflare	2026-05-19 (target)

This table is updated whenever our infrastructure changes. The current canonical machine-readable form is at: https://xeniadata.com/crawler/ip-ranges.json (live at deployment).
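Once ranges are published, a site operator could check whether a request that claims the Xenia user-agent actually originates from a listed range. The sketch below uses Python's standard ipaddress module. The function name and return labels are hypothetical, the CIDR values are reserved documentation ranges rather than real Xenia ranges, and the schema of ip-ranges.json is not yet published, so the function simply takes a list of CIDR strings however they were obtained.

```python
import ipaddress

UA_PREFIX = "Xenia-Crawler/"  # leading token of the documented user-agent string


def classify_request(user_agent: str, remote_ip: str, published_cidrs: list) -> str:
    """Classify a request as 'other', 'xenia', or 'suspected-impersonation'."""
    if not user_agent.startswith(UA_PREFIX):
        return "other"  # does not claim to be Xenia at all
    addr = ipaddress.ip_address(remote_ip)
    networks = [ipaddress.ip_network(cidr) for cidr in published_cidrs]
    # Membership is False when the address and network IP versions differ.
    if any(addr in net for net in networks):
        return "xenia"
    return "suspected-impersonation"


# Placeholder ranges (RFC 5737 / RFC 3849 documentation blocks), NOT Xenia's.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]
UA = "Xenia-Crawler/1.0 (+https://xeniadata.com/crawler)"

print(classify_request(UA, "192.0.2.17", EXAMPLE_RANGES))      # xenia
print(classify_request(UA, "198.51.100.5", EXAMPLE_RANGES))    # suspected-impersonation
print(classify_request("Mozilla/5.0", "192.0.2.17", EXAMPLE_RANGES))  # other
```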

Contact for opt-out, cease-and-desist, or rate limiting

If you wish to:

- Opt your domain out of Xenia's automated observation entirely,
- Request a custom crawl rate (e.g., once per week instead of once per day),
- Issue a cease-and-desist for one or more URLs or paths,
- Report suspected impersonation of the Xenia user-agent,

please email:

crawler@xeniadata.com

We commit to acknowledging your request within 2 business days and to honoring it within 7 business days. We log every opt-out request to a permanent record and we do not retry domains we have been asked to stop crawling.

For copyright takedown notices (DMCA), see our DMCA designated agent page instead.

What we collect

Xenia collects facts from public web pages. Examples:

The hotel's name, address, geographic coordinates, phone number
Star or class rating where assigned by an official body (AAA, Forbes, Michelin, government)
Amenity presence (pool, gym, restaurant, parking, Wi-Fi)
Check-in / check-out times where publicly stated
Pet policy, smoking policy, family policy where publicly stated

We also use licensed APIs (Cloudbeds for properties under management, Google Places API where licensed) and we collect first-party data through our own onboarding flow with explicit license grants from property owners.

What we do NOT collect

Verbatim review text from any source
Verbatim hotel descriptions, marketing copy, or editorial content
Professional photographs sourced from third parties without documented license
Personal data of identifiable individuals (guest names, employee personal information beyond a publicly listed work contact)
Content gated behind authentication, paywall, or any access control
Any source material from a pirate library or shadow archive

Our compliance posture

Xenia's full compliance framework is published at https://xeniadata.com/methodology and includes:

Three-tier attribute classification (Verified Fact / Self-Reported / Xenia-Derived Inference)
Display language conventions for derived inferences
Property right-to-respond mechanism for any Xenia-derived claim about a property
Memorization testing for AI-derived outputs
Audit logging of every data action with immutable retention
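As a rough illustration of how the three-tier classification and the right-to-respond rule could be represented in code, here is a minimal Python sketch. The class names, field names, and example values are hypothetical and are not taken from the published methodology; only the three tier labels and the rule that right-to-respond covers Xenia-derived claims come from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    VERIFIED_FACT = "Verified Fact"
    SELF_REPORTED = "Self-Reported"
    XENIA_DERIVED = "Xenia-Derived Inference"


@dataclass(frozen=True)
class Attribute:
    """One fact about a property, tagged with its classification tier."""
    name: str
    value: object
    tier: Tier
    source: str  # hypothetical provenance field


def right_to_respond_applies(attr: Attribute) -> bool:
    """Right-to-respond covers any Xenia-derived claim about a property."""
    return attr.tier is Tier.XENIA_DERIVED


# Hypothetical example records.
pool = Attribute("has_pool", True, Tier.SELF_REPORTED, "property website")
quiet = Attribute("likely_quiet", True, Tier.XENIA_DERIVED, "inference")

print(right_to_respond_applies(pool))   # False
print(right_to_respond_applies(quiet))  # True
```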

A copyright agent is designated with the U.S. Copyright Office per DMCA § 512. See https://xeniadata.com/dmca.

Last updated

2026-05-13. Material changes to this page will be announced at least 7 days in advance to any party that has subscribed to crawler-information updates.