How iCruise.com defeated web scrapers with Distil Networks
By cameron in Uncategorized
Travel suppliers, airlines, hotels, OTAs, and metasearch sites are constantly scraped and abused by bad bots. These automated visitors inflict long-term SEO damage, pillage customer accounts, skew look-to-book ratios, inflate GDS fees, and rob you of direct and ancillary revenue.
Tnooz, Distil Networks and WMPH Vacations (of iCruise.com fame) recently held a workshop to help industry pros better understand what’s at stake and how to solve their bad bot problems. By attacking bad bots and preventing illicit web scraping, iCruise.com improved site speed by 40 percent and increased conversion rates by 22 percent.
Distil Networks’ VP of Marketing Elias Terman kicked off the workshop with a data-driven overview of the current state of the bad bot landscape, drawing on their latest Bad Bot Report to highlight the continuing widespread use of web scraping, the recent shift of bad bot activity to mobile and APIs, and new bot-driven booking engine threats such as spinning (aka hoarding).
Here are the high-level things to know about bad bots that scrape away your profits by overusing server resources, slowing site speed for legitimate customers, and wasting staff time on related IT issues.
Bots go mobile
Bots track trends, and the trend in the travel business is very much towards mobile. The number of bad bots impersonating mobile browsers grew by almost 43 percent last year.
With the advent of mobile device farms (e.g. AWS Device Farm and Google’s Firebase Test Lab) and mobile device emulators, it’s never been easier for the bad guys to abuse mobile apps and the APIs that power them.
Web scraping is still a huge problem everywhere
Web scraping remains the most prevalent automated threat on the internet – 97 percent of all Distil’s customers have been targeted by web scrapers.
Scraping is incredibly easy to do, and scrapers are after pretty much anything that can be accessed through your website or APIs, or your partners’:
- Customer data, contact information, and credit card data
- Pricing, availability, vendor, and partner information
- User reviews, editorial content, photos, venues, itineraries, and tours
- GDS API pulls, vendor and/or partner information
- Keyword and SEO strategies
Other online travel business threats
Going beyond web scraping, spinning is a new threat specifically targeting the APIs that power mobile booking applications.
Attackers use mobile device emulators to continuously hold seats in the airline booking engine without buying anything until they find the right price on a secondary market, a strategy that leaves planes with unsold seats and operators without their margin-critical add-on sales.
Site slowdowns and downtime
One-third of all sites get hit with unexpected traffic spikes (measured as three times their rolling average).
These spikes often result in site slowdowns and downtime, a potentially disastrous scenario for a business where a slower response time almost always means a lost sale. These application-layer attacks (aka business logic attacks) aren’t even visible to most web security tools.
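The spike definition above (traffic at three times the rolling average) can be sketched in a few lines of Python. This is an illustrative sketch only, not Distil's method: the function name `find_spikes`, the five-sample window, and the sample traffic figures are all assumptions made for the example.

```python
from collections import deque


def find_spikes(requests_per_minute, window=5, threshold=3.0):
    """Return the indices of samples that exceed `threshold` times the
    rolling average of the preceding `window` samples.

    Hypothetical helper: the window size is an assumption; only the 3x
    multiplier comes from the article.
    """
    recent = deque(maxlen=window)  # the last `window` samples seen so far
    spikes = []
    for i, count in enumerate(requests_per_minute):
        # Only judge a sample once a full window of history exists.
        if len(recent) == window:
            baseline = sum(recent) / window
            if baseline > 0 and count > threshold * baseline:
                spikes.append(i)
        recent.append(count)
    return spikes


# Made-up traffic: steady around 100 requests per minute, one burst of 400.
traffic = [100, 110, 90, 105, 95, 400, 100, 98]
print(find_spikes(traffic))  # [5]
```

Note that a spike found this way says nothing about who caused it; telling a bot-driven burst from a genuine surge in shoppers takes more than request counts.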
Account-based fraud
And of course, bots love login pages!
Armed with billions of credentials (three billion from Yahoo alone) and secure in the knowledge that many of us use the same usernames and passwords for multiple sites, the bad guys just hammer away until they get through to that stash of payment card data, loyalty points, and other easy money found within our online accounts.
According to Connexions, airline loyalty points can go for $200 or more on the black market, and 72 percent of loyalty program managers have experienced this type of fraud firsthand.
Skewed analytics and poor funnel optimization
Today’s bots are adept at loading external assets like JavaScript, making their activity look like human activity in analytics, A/B testing, conversion tracking, and more.
As a result, decisions are made based on inaccurate data, impacting funnel analysis and optimization, look-to-book ratios, conversion rates, KPI tracking, and infrastructure expansion planning.
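A short worked example shows how unfiltered bot traffic distorts these numbers. All figures below are hypothetical and chosen only to make the arithmetic easy to follow; the look-to-book ratio is taken here as searches divided by bookings.

```python
def look_to_book(searches, bookings):
    """Searches ("looks") per completed booking."""
    return searches / bookings


# Hypothetical month of traffic (illustrative numbers, not from the article).
human_searches = 10_000
bot_searches = 40_000   # scraper traffic that analytics counts as visitors
bookings = 200          # only humans actually book

true_ratio = look_to_book(human_searches, bookings)
observed_ratio = look_to_book(human_searches + bot_searches, bookings)

true_conversion = bookings / human_searches
observed_conversion = bookings / (human_searches + bot_searches)

print(f"look-to-book: true {true_ratio:.0f}:1, observed {observed_ratio:.0f}:1")
print(f"conversion:   true {true_conversion:.1%}, observed {observed_conversion:.1%}")
```

With these assumed figures, the look-to-book ratio appears five times worse than it really is and the conversion rate appears five times lower, even though customer behaviour has not changed at all.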
Case study: WMPH Vacations
WMPH Vacations is in hurricane country so, to ensure as much uptime as possible, their business is over 90 percent cloud-based.
With 30 websites, nine brands, an intranet-dependent distributed workforce, virtual servers on AWS, a cloud-based phone system, multiple API-based partnerships, and an award-winning mobile app, they present a prime target for the bad guys.
The evidence was overwhelming. There were CPU spikes, API scraping that almost took a partner offline, a constant barrage of SQL injection attacks, skewed metrics, and spammed forms that bogged down customer service phone queues.
But the company had almost no insight into why this was happening, which made it hard to figure out a way to stop it.
Antoine Zammit, VP of Technology at WMPH Vacations, had tried to solve the problem in-house. He tried the following without much success:
- CAPTCHAs impacted the user experience and can be defeated by advanced persistent bots
- Log analysis couldn’t differentiate between sophisticated bots and real humans
- IP blocking and rate limiting just slowed things down and were overwhelmed by all but the most basic bots
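To illustrate the last point, here is a minimal sketch of per-IP rate limiting. It is not WMPH's or Distil's implementation: the class name `SlidingWindowLimiter`, the limits, and the addresses (taken from reserved documentation ranges) are all assumptions. It shows one commonly cited weakness of the approach, which is that a bot spreading its requests across many addresses never trips a per-address limit.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client IP within `window_seconds`.

    Hypothetical example class, keyed on IP address only.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> recent request times

    def allow(self, client_ip, now):
        hits = self._hits[client_ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)

# A single noisy client is throttled on its fourth request...
single = [limiter.allow("203.0.113.7", now=t) for t in range(4)]
print(single)  # [True, True, True, False]

# ...but the same four requests spread across four addresses all pass.
distributed = [limiter.allow(f"198.51.100.{n}", now=0) for n in range(4)]
print(distributed)  # [True, True, True, True]
```

Tightening the limit to catch the distributed case would also throttle legitimate shoppers who share an address, such as users behind a corporate or mobile carrier gateway.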
By working with Distil, WMPH has been able to protect their business, web properties, apps, and partners from fraudulent actors.
Their leads are up 100 percent, response times have improved by 40 percent, conversion rates are up 22 percent, and their development team has 20 more hours a month to invest in growing the business. Those metrics matter and have produced dramatic bottom-line results.
As an added bonus, the company has been able to share Distil’s bot activity reports with their partners, adding to everyone’s peace of mind.
Photo by James Pond on Unsplash.