How we compute fair online prices
Every number on this site is reproducible from public online listing data. This page documents exactly how the pipeline works so you can decide how much weight to give our numbers.
WagaWise is not a marketplace and does not sell products. We analyze public online listings and price signals to help buyers compare products, understand fair online price ranges, and avoid overpaying. Our estimates are guidance, not official prices.
Our numbers are based on recent public online listings. Negotiated prices, regional offline prices, and private seller prices may differ from the online listings we analyze.
Collection
We read public-facing product listings from a small list of vetted Ethiopian sources. For every source we:
- Review the site's terms of service and robots.txt before enabling it.
- Identify ourselves with a User-Agent and contact email.
- Throttle requests (default 2.5 s between fetches per host).
- Never bypass paywalls, logins, captchas, or anti-bot systems.
- Never copy product photos or full descriptions — only title, price, city, condition, source URL, and collection date.
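The per-host throttle above can be illustrated with a minimal Python sketch. It is not the site's actual crawler: the class name `HostThrottle` and its injectable `clock` and `sleep` parameters are hypothetical, chosen so the behaviour can be checked without real waiting. Only the 2.5 s default comes from the text.

```python
import time
from urllib.parse import urlparse


class HostThrottle:
    """Enforce a minimum delay between fetches to the same host."""

    def __init__(self, min_delay_s=2.5, clock=time.monotonic, sleep=time.sleep):
        self.min_delay_s = min_delay_s
        self._clock = clock
        self._sleep = sleep
        self._last_fetch = {}  # host -> time of the most recent fetch

    def wait(self, url):
        """Block until the URL's host may be fetched again; return seconds slept."""
        host = urlparse(url).netloc
        now = self._clock()
        last = self._last_fetch.get(host)
        delay = 0.0
        if last is not None:
            # Sleep only for the part of the interval that has not yet elapsed.
            delay = max(0.0, self.min_delay_s - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_fetch[host] = now + delay
        return delay
```

A caller would invoke `throttle.wait(url)` immediately before each request. Hosts are tracked independently, so a slow source does not hold up the others.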
Each listing is stored once. Duplicate detection uses a SHA-256 hash of source URL + title + price, so a listing that's re-fetched tomorrow doesn't double-count.
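As a rough illustration of the duplicate check, the following Python sketch hashes the three fields named above. The function names, the dictionary keys, and the choice of field separator are assumptions made for the example; the text only states that SHA-256 is applied to source URL + title + price.

```python
import hashlib


def listing_fingerprint(source_url, title, price_etb):
    """Return a SHA-256 hex digest over source URL, title, and price."""
    # Assumption: fields are joined with the ASCII unit separator so that
    # ("ab", "c") and ("a", "bc") cannot produce the same payload.
    payload = "\x1f".join([source_url.strip(), title.strip(), str(price_etb)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def store_once(seen, listing):
    """Add the listing to `seen` unless its fingerprint is already present."""
    key = listing_fingerprint(listing["url"], listing["title"], listing["price_etb"])
    if key in seen:
        return False  # same URL, title, and price: a re-fetch, not a new listing
    seen[key] = listing
    return True
```

Because the price is part of the hash, a listing whose price changes would get a new fingerprint under this sketch and be stored as a separate record.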
Normalization
Raw titles are messy ("Iphone 14 promax 256 gb used", "14PM 256 slightly used"). We parse them into structured fields — brand, model, variant, storage, RAM, condition, city, price-in-ETB — using a regex-driven rule set. When an OPENAI_API_KEY is configured, the rules' best guess is passed to an LLM that produces a strict JSON object, which is validated against a zod schema. If the LLM call fails or returns invalid data, we fall back to the rule-based output.
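The rules-first, validate-or-fall-back flow might look roughly like this in Python. It is a simplified sketch: it covers only two of the fields (storage and condition), the regexes are illustrative rather than the real rule set, and `is_valid` is a hand-written stand-in for the zod schema, which is a TypeScript library. All function and field names here are hypothetical.

```python
import json
import re

CONDITIONS = {"new", "used", "refurbished", "unknown"}


def parse_title(title):
    """Rule-based best guess for storage and condition from a raw title."""
    text = title.lower()
    # Skip sizes that are immediately followed by "ram" (e.g. "8gb ram").
    storage = re.search(r"(\d+)\s*(gb|tb)\b(?!\s*ram)", text)
    storage_gb = None
    if storage:
        size = int(storage.group(1))
        storage_gb = size * 1024 if storage.group(2) == "tb" else size
    if re.search(r"\brefurb", text):
        condition = "refurbished"
    elif re.search(r"\bused\b", text):
        condition = "used"
    elif re.search(r"\bnew\b", text):
        condition = "new"
    else:
        condition = "unknown"
    return {"storage_gb": storage_gb, "condition": condition}


def is_valid(obj):
    """Strict shape check, standing in for schema validation."""
    if not isinstance(obj, dict) or set(obj) != {"storage_gb", "condition"}:
        return False
    storage, condition = obj["storage_gb"], obj["condition"]
    storage_ok = storage is None or (
        isinstance(storage, int) and not isinstance(storage, bool) and storage > 0
    )
    return storage_ok and isinstance(condition, str) and condition in CONDITIONS


def normalize(title, llm_response=None):
    """Prefer a valid LLM refinement; otherwise keep the rule-based output."""
    rule_based = parse_title(title)
    if llm_response is None:  # no API key configured, or the call failed
        return rule_based
    try:
        candidate = json.loads(llm_response)
    except (TypeError, ValueError):
        return rule_based
    return candidate if is_valid(candidate) else rule_based
```

For "14PM 256 slightly used" the rules above find no storage size because the unit is missing, which is the kind of gap the optional LLM pass is meant to fill. If that pass returns malformed or out-of-schema JSON, the rule-based result is kept unchanged.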
Normalized listings are then matched to a canonical product via: exact brand+model+variant → exact brand+model → alias substring → fuzzy Jaccard over the canonical name. Listings that fail all four steps appear in the admin "Unmatched" queue for review.
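The four-step cascade can be sketched as follows. The catalog shape (dictionaries with `brand`, `model`, `variant`, `aliases`, and `name`), the step labels, word-level tokenization for Jaccard, and the 0.5 similarity threshold are all assumptions for illustration; the text does not specify them.

```python
def jaccard(a, b):
    """Jaccard similarity between the word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def match_product(listing, catalog, threshold=0.5):
    """Return (product, step) for the first step that matches."""
    brand = listing.get("brand")
    model = listing.get("model")
    variant = listing.get("variant")
    title = listing["title"].lower()

    # Step 1: exact brand + model + variant.
    for p in catalog:
        if (p["brand"], p["model"], p["variant"]) == (brand, model, variant):
            return p, "brand+model+variant"
    # Step 2: exact brand + model.
    for p in catalog:
        if (p["brand"], p["model"]) == (brand, model):
            return p, "brand+model"
    # Step 3: a known alias appears as a substring of the title.
    for p in catalog:
        if any(alias.lower() in title for alias in p["aliases"]):
            return p, "alias"
    # Step 4: best Jaccard score against the canonical name, if high enough.
    best = max(catalog, key=lambda p: jaccard(title, p["name"]), default=None)
    if best is not None and jaccard(title, best["name"]) >= threshold:
        return best, "fuzzy"
    # Nothing matched: this listing would go to the "Unmatched" review queue.
    return None, "unmatched"
```

Ordering the steps from strictest to loosest means a fuzzy match is only considered when no exact or alias match exists, which keeps near-miss names from overriding a clean structured match.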
Outlier removal
Ethiopian listings include occasional bogus prices (a missing zero, a placeholder "10 ETB"). Before computing fair prices we remove outliers using two filters in combination:
- IQR fence — drop anything outside [Q1 − 1.5·IQR, Q3 + 1.5·IQR].
- Hard ratio caps — drop anything below 25% of the median or above 4× the median.
We keep the cleaned distribution for everything that follows. The same filter is applied to the price-distribution chart on each product page, so what you see is what we compute against.
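A minimal Python sketch of the combined filter is shown below. It assumes that quartiles and the median are both computed on the raw prices and that a listing is kept only if it passes both checks; whether the real pipeline applies the filters in sequence, and which quantile method it uses, is not stated here. The small-sample guard is also an assumption.

```python
import statistics


def remove_outliers(prices):
    """Keep prices that pass both the IQR fence and the median ratio caps."""
    if len(prices) < 4:
        return list(prices)  # assumption: too few points to estimate quartiles
    q1, _, q3 = statistics.quantiles(prices, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    med = statistics.median(prices)
    lo_cap, hi_cap = 0.25 * med, 4 * med
    return [
        p for p in prices
        if lo_fence <= p <= hi_fence and lo_cap <= p <= hi_cap
    ]
```

The two filters complement each other: when prices are widely spread, the lower IQR fence can fall below zero and catch nothing, while the 25% ratio cap still removes a placeholder price such as 10 ETB.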
Stratification by condition
A used phone and a new phone of the same model have very different fair prices. Mixing them into one median tells you nothing useful. We compute fair-range stats once per condition:
- all: every listing, regardless of condition.
- new, used, refurbished, unknown: only listings tagged with that condition.
Per-condition stats are only written when the bucket has at least 3 listings — below that, the number would be noise. Each product page shows the available strata as chips; pick the one you're shopping for.
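The bucketing can be sketched in Python as below. The field names `condition` and `price_etb` are hypothetical, and the sketch assumes that a missing or unrecognized condition is treated as `unknown` and that the 3-listing minimum applies to the per-condition buckets only, since the text does not give a minimum for `all`.

```python
from collections import defaultdict

MIN_LISTINGS = 3
CONDITIONS = ("new", "used", "refurbished", "unknown")


def stratify(listings):
    """Group prices into 'all' plus one bucket per condition; drop thin buckets."""
    buckets = defaultdict(list)
    for item in listings:
        condition = item.get("condition")
        if condition not in CONDITIONS:
            condition = "unknown"  # assumption: untagged listings land here
        buckets["all"].append(item["price_etb"])
        buckets[condition].append(item["price_etb"])
    return {
        name: prices
        for name, prices in buckets.items()
        if name == "all" or len(prices) >= MIN_LISTINGS
    }
```

The keys of the returned dictionary correspond to the chips a product page would show: a condition with only one or two listings simply does not appear.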
Fair range, median, and confidence
For each stratum we publish:
- Median: the middle asking price, with half of the listings above it and half below. We display this as the headline number because the average can be skewed by extreme listings.
- Fair range: the inter-quartile range (P25 to P75). The middle 50% of the cleaned listings sit here.
- Suspicious-low threshold: 75% of P10. A listing below this may indicate fraud, incorrect specs, damaged condition, a locked device, unpaid loans, or another issue. We flag, never accuse.
- Overpriced threshold: 125% of P90. Listings above this are well above the recent online range; the seller may be over-asking.
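The published figures for one stratum can be computed along these lines. This is a sketch, not the production code: the percentile method (linear interpolation via the standard library's inclusive method), the dictionary keys, and the labels returned by `classify` are assumptions.

```python
import statistics


def fair_range_stats(prices):
    """Compute the published figures for one stratum of cleaned prices."""
    # n=100 yields the 99 cut points P1..P99; index k-1 holds the k-th percentile.
    pct = statistics.quantiles(prices, n=100, method="inclusive")
    p10, p25, p75, p90 = pct[9], pct[24], pct[74], pct[89]
    return {
        "median": statistics.median(prices),
        "fair_low": p25,
        "fair_high": p75,
        "suspicious_low_below": 0.75 * p10,
        "overpriced_above": 1.25 * p90,
    }


def classify(price, stats):
    """Label a single asking price against the stratum's thresholds."""
    if price < stats["suspicious_low_below"]:
        return "suspicious-low"
    if price > stats["overpriced_above"]:
        return "overpriced"
    if stats["fair_low"] <= price <= stats["fair_high"]:
        return "fair"
    return "outside-fair-range"
```

For eleven evenly spaced prices from 90,000 to 100,000 ETB, this gives a median of 95,000, a fair range of 92,500 to 97,500, a suspicious-low threshold of 68,250 (75% of P10 = 91,000), and an overpriced threshold of 123,750 (125% of P90 = 99,000).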
Confidence is derived from listing volume and source diversity:
- High: 25+ listings from 3+ sources.
- Medium: 10+ listings from 2+ sources.
- Low: fewer than 10 listings or only one source.
Treat Low confidence as a hint: the fair range is plausible but the sample is too thin to bet on. Confirm against at least one other source before paying.
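The confidence tiers translate directly into a small function. The function name and lowercase labels are hypothetical; the thresholds are the ones listed above.

```python
def confidence(listing_count, source_count):
    """Map listing volume and source diversity to a confidence tier."""
    if listing_count >= 25 and source_count >= 3:
        return "high"
    if listing_count >= 10 and source_count >= 2:
        return "medium"
    # Fewer than 10 listings, or everything came from a single source.
    return "low"
```

Both conditions must hold for a tier, so 30 listings from a single source still rate as low, and 30 listings from two sources rate as medium.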
What we don't claim
- We are not a marketplace. We don't take payments and we don't broker deals.
- We cannot guarantee any listed device's authenticity, condition, or legal status. Always inspect in person, check IMEI, and verify warranty.
- Our fair ranges are statistical, not prescriptive. A skilled negotiator regularly beats them; a careless buyer can pay above them.
- Stale snapshots are clearly labeled. If a product shows "Updated 9 days ago", treat the number as a starting point, not the current market.
Methodology last reviewed for the MVP release. Material changes will be noted here.