Fight against Fraud: Amazon Fraud Detector Review

Amazon Fraud Detector is a high-level AI service that has just become generally available (GA) in several US regions, Ireland, Singapore and Sydney. It was originally launched at re:Invent 2019 and we evaluated it shortly afterwards, so this blog describes our experience with the preview release.

Inawisdom is an AWS Partner Network (APN) Premier Consulting Partner with the AWS Machine Learning Competency. We work with organisations in a variety of industries to help them exploit their data assets.

Our goal at Inawisdom is to accelerate adoption of advanced analytics, artificial intelligence (AI), and machine learning (ML) by providing a full stack of AWS Cloud and data services, from platform through data engineering, data science, AI/ML, and operational services.

The cost of online fraud

Online fraud is estimated to cost businesses more than £3 billion a year, according to the FBI’s Internet Crime Report 2019. The report shows that, excluding the USA, the UK is by far the worst affected country by number of victims. Sadly, even the recent COVID-19 lockdown has led to an increase in attempts to attack vulnerable Internet users. The main risks include:

  • Email account compromise (personal or business)
  • Social engineering (phishing, vishing, smishing, etc.)
  • New account fraud
  • Online account takeover or extortion, e.g. ransomware
  • Non-payment or non-delivery (including compromised card numbers)
  • Scams, organised crime rings and hacktivist/terrorism

Our fightback against this growth in fraud relies on post-fraud analysis, which has identified patterns. Common ones include a shared IP address, or similar data across fraudulent accounts, such as email domains. In other cases, the country, residential status or work status entered when applying for an account is faked. Fraudsters in turn evolve new behaviours to get around the preventative measures. So we do not have a single problem to solve; we need a strategy that lets us respond as the fraud changes.

Machines reduce fraud

We commonly use machine learning to detect “something fishy” going on. We have grown used to “spam” email filtering. It is the classic example of using training data to recognise and categorise new events (arriving mail) so that an inbound email goes to “Junk”.

Mostly it’s right, but we’ve all suffered from missing some email or other which was mis-categorised. Other common scenarios include detecting likely fraudulent online orders and creation of fake accounts. Our earliest machine learning projects here at Inawisdom included those for detecting insurance fraud. One current focus is augmented intelligence so the machine reduces the burden on humans to check for fraud.

The Amazon Fraud Detector service

Amazon Fraud Detector is a high-level AI service launched at re:Invent 2019. It is designed to detect potentially fraudulent online activities and prevent them. I tried it out recently and found it easy to use and flexible enough to be a useful tool for accelerating how AI can be embedded in your online business. You can dynamically adjust the customer experience based on a result from Fraud Detector. Trusted customers who pass the detector get an optimal experience, so they do not suffer because of activity from fraudsters. If the detector returns a positive match, you can ask for further authentication, for example a PIN sent to a mobile number.

There are three model templates, only the first of which is available in the trial service: Online Fraud Insights, Account Takeover and Transaction Fraud. Many more are promised and you can bring your own SageMaker model.

The minimum required training data for the online fraud model is email, timestamp, IP address and a label stating whether the record is fraudulent or not. Any number of other attributes can of course be added to improve the model. The key challenge is to separate the relevant attributes from the less relevant ones when seeking out fraudulent activity.

Why ML is better than rules

In the early days of online fraud prevention, we were encouraged to include specific checks; business rules in effect. Business rules are great for covering specific conditions such as:

      IF IP_ADDRESS_LOCATION == 'England'
        AND CUST_ADDRESS_COUNTRY != 'England'
      THEN CALL 'AlertInvestigateLocationAnomaly'
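
As a minimal sketch, the rule above could be written in Python as follows. The field names mirror the pseudocode; the `location_anomaly` function and the dictionary-shaped event are illustrative assumptions and not part of any AWS API.

```python
def location_anomaly(event: dict) -> bool:
    """Return True when the IP geolocation is England but the
    customer-supplied address country is not (the rule above)."""
    return (
        event.get("IP_ADDRESS_LOCATION") == "England"
        and event.get("CUST_ADDRESS_COUNTRY") != "England"
    )


# Hypothetical events for illustration only.
suspicious = {"IP_ADDRESS_LOCATION": "England", "CUST_ADDRESS_COUNTRY": "France"}
consistent = {"IP_ADDRESS_LOCATION": "England", "CUST_ADDRESS_COUNTRY": "England"}

print(location_anomaly(suspicious))  # True
print(location_anomaly(consistent))  # False
```

A fraudster who simply enters an address that matches the IP location no longer triggers this check, which illustrates how narrow a single hand-written rule is.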

With Machine Learning the focus is identifying general patterns by looking at lots of examples. That way it can still work when fraudsters change part of their approach. ML models are therefore not as brittle to changes that would negate rules like the one above.

It’s also important to continue to adapt; to close the loop to confirm fraud labelling and retrain the model, optimising the performance. We still use rules to interpret the model but these rules are based on the output from the ML model. Human fraud detectors will help to tune the rules, so we’re still very important in the process.

The reason for fraud-specific modelling is that generic models underperform and often require time-consuming data transformations. Feature engineering, and working out which features matter most, is harder still.

Benefits of Fraud Detector

The main benefit is that it uses Amazon’s built-in online fraud expertise. It gives your fraud team more control to swap detection models and versions in and out on the fly. Because it draws on Amazon’s rich history of fighting fraud, it accelerates time to value from your own data.

It has been designed to integrate with other AWS services. You can build from a template with data in S3. Then train your model using an automated pipeline. Your upload of data to S3 can trigger model retraining if necessary. And, of course, you can use your own custom models if you want to.

How do you use it?

As with many AWS high-level ML services, there is a set workflow to follow, some steps of which are optional.

Source your data

The first dependency is that you have a reasonable amount of historical fraud data where you have identified (retrospectively in many cases) that an activity was dodgy. The current ‘Online Fraud Insights’ template expects a minimum of 10,000 records and Amazon recommend 3-6 months of historic data.

Example fraud detector dataset

The example dataset includes this amount of generated fake data in order to try out the service. Note particularly the highlighted “fraud_label”, which is the output used in training. It can be any set of values but to keep it simple the test set defines “0” as legitimate data and “1” as the fraudulent label. These values also need to be configured in the model definition.

Create and train the model

Second, create your ML model. In simple cases you can use the provided Amazon template or, if it doesn’t meet your needs, bring your own SageMaker model. Define the data source path in S3 and choose the attributes to use as model inputs from the column headers. You also specify the IAM role to use and those all-important labels.

Enter fraud label values

Tune your model

Train the model and tune it to your desired level of detection accuracy. This is a really critical decision point: the trade-off between inconveniencing legitimate customers and stopping the minority who commit fraud.

Define and train the model

On the Models page you see a list of all your models and have to click on a specific one to see its performance and status. One fairly minor but useful update would be to show these attributes in the table itself, in case your number of models grows. The overview table looks set to be enhanced to do just that, given that the headings already have sort icons; sorting by performance (AUC), for example, would be great.

Manage your fraud detector models

Deploy your model version

Once you’re happy with the performance against test data, which should probably be higher than ~90%, you can deploy the model. This is done as a “version”. It wasn’t immediately obvious, but to get more detailed metrics for your trained model, click on the version (e.g. 1.0) on the left. This link displays charts and tables, including the confusion matrix.

Check the metrics

This report also includes a simple guide to “tune” the model to minimise false negatives (i.e. it missed the fraud) whilst also attempting to minimise false positives (affecting your genuine users). This is why the overall metric, displayed as Area Under Curve (AUC), links True Positive Rate (TPR) to False Positive Rate (FPR). As mentioned above, this is the crux of fraud detection and prevention, and what is acceptable should ultimately be a business decision.
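
To make the link between TPR, FPR and AUC concrete, here is a minimal pure-Python sketch. It is not how the service computes its metrics; the scores and labels are made-up values, with higher scores meaning "more likely fraud" and labels using 1 for fraud and 0 for legitimate, as in the example dataset.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when scores >= threshold are flagged as fraud."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def auc(scores, labels):
    """Area under the ROC curve (TPR against FPR) via the trapezoidal rule."""
    points = [(0.0, 0.0)]  # (FPR, TPR) when nothing is flagged
    # Lower the threshold step by step so both rates only ever grow.
    for t in sorted(set(scores), reverse=True):
        tpr, fpr = rates_at_threshold(scores, labels, t)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical model scores and true labels (1 = fraud, 0 = legitimate).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

print(rates_at_threshold(scores, labels, 0.65))
print(round(auc(scores, labels), 3))  # 0.889
```

Raising the threshold lowers the FPR (fewer genuine users affected) but also lowers the TPR (more fraud missed); the AUC summarises that trade-off across all thresholds.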

Machine Learning Metrics

The “Table” tab also gives a handy guide to how setting the false positive rate too low would affect the true positives. Also on this page, the “Actions” menu in the top right is where you can deploy your model. Once you select “Deploy model version”, the status changes to “Deploying” and, after a few minutes, the version is deployed and moves to the “Active” state.

Define a detector or two

Once the model version is Active, you have to define a detector so that the GetPrediction API can be called through an endpoint. Your detectors reference the deployed model and version and then configure actual outcome behaviours when fraud is suspected. You do this by defining decision rules, which can be based on the raw data or the outputs of the training, known as “insights”. Each rule, when triggered, invokes one or more “outcomes”.

Configure Rules for Execution

You can create new outcomes or have multiple rules lead to the same outcome (e.g. reject the account). It’s a nice feature that rules can lead to more than one outcome, but this could become a confusing mess if the outcomes themselves need to change. So my recommendation would be to draw up a simple decision tree of these rules. Perhaps the interface could add one, a bit like the graphical UI for Step Functions.

Test the model with example data

Once all this is done, your applications must submit events generated online to get back a prediction of potential fraud. Although you can define many detectors, only one can be ‘Active’ at a time. You can have other versions in ‘Draft’ or retain older detectors in an ‘Inactive’ state within the service. I found the versioning of models, detectors and rules very powerful, but I could also see how someone could lose track of it all. When you execute the detector API you can get back either the first matched rule or all matched rules.
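
The difference between "first matched" and "all matched" can be sketched in plain Python. This is only an illustration of the idea, not the service's rule language or API: the `score` key, the thresholds and the outcome names are all hypothetical.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    name: str
    matches: Callable[[dict], bool]
    outcomes: list


# Hypothetical rules, listed in priority order.
RULES = [
    Rule("high_risk", lambda e: e["score"] >= 900, ["reject_account"]),
    Rule("medium_risk", lambda e: e["score"] >= 700,
         ["request_pin", "flag_for_review"]),
    Rule("default", lambda e: True, ["approve"]),
]


def evaluate(event: dict, rules=RULES, first_matched: bool = True) -> list:
    """Return outcomes from the first matching rule, or from every matching rule."""
    outcomes = []
    for rule in rules:
        if rule.matches(event):
            for outcome in rule.outcomes:
                if outcome not in outcomes:  # avoid duplicate outcomes
                    outcomes.append(outcome)
            if first_matched:
                break
    return outcomes


print(evaluate({"score": 750}))                       # ['request_pin', 'flag_for_review']
print(evaluate({"score": 950}, first_matched=False))  # every rule matches here
```

With "first matched", rule order acts as a priority list; with "all matched", every triggered outcome comes back and the calling application has to decide how to combine them.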

A simple test form is incorporated in the service for the fundamental attributes along with the outcome, which is currently only a string until we develop an actual workflow behind it.

Run Tests within the Service

Considerations for Live Service

The inferences need to be returned with low latency so as not to disrupt online users. Typical models return within 50 milliseconds on average. As you would expect, the service auto-scales under higher demand. Regularly check the relevance of the rules and the freshness of the data, and retrain if necessary. Treat online fraud detection as part of a wider, layered security strategy. Linking fraud detection rates to WAF rules is one example of how defence in depth can give your website some quality protection.

Practical Considerations

At the moment it’s only available in the us-east-1 region as a Preview service, so it is unsuitable for any non-US applications. Once it is GA, I would expect it to launch in Europe and the UK and to be GDPR compliant.

The User Guide lists a number of conditions that the data should meet. The service handles missing data, but it issues a warning if more than 20% of the values in any required column are null (90% in optional columns). Warnings are also shown if an optional field contains a single value.

The model will fail if there are fewer than 10,000 rows or fewer than 500 are labelled as positive for fraud. Timestamps are also pretty crucial, so the model fails if more than 0.1% of values in “Event Timestamp” are invalid or null.
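
The limits quoted above are easy to check before uploading a dataset. The sketch below is a rough pre-flight check only, not the service's own validation: the column names, the ISO 8601 timestamp format and the `"1"` fraud label are assumptions, and it covers required columns only.

```python
from datetime import datetime

# Assumed column names; adjust to match your own CSV headers.
REQUIRED = ["email", "timestamp", "ip_address", "fraud_label"]


def _valid_timestamp(value):
    """True if value parses as ISO 8601 (an assumed format for this sketch)."""
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def check_training_data(rows, fraud_value="1"):
    """Return (errors, warnings) for a list of dict rows."""
    errors, warnings = [], []
    total = len(rows)
    if total < 10_000:
        errors.append(f"{total} rows; at least 10,000 are needed")
    fraud = sum(1 for r in rows if r.get("fraud_label") == fraud_value)
    if fraud < 500:
        errors.append(f"{fraud} fraud rows; at least 500 are needed")
    bad = sum(1 for r in rows if not _valid_timestamp(r.get("timestamp")))
    if total and bad / total > 0.001:
        errors.append(f"{bad} invalid or null timestamps (over 0.1%)")
    for col in REQUIRED:
        nulls = sum(1 for r in rows if not r.get(col))
        if total and nulls / total > 0.20:
            warnings.append(f"{col}: more than 20% nulls")
    return errors, warnings


# Tiny hypothetical sample: far too small, so the size checks fail.
sample = [
    {"email": "a@example.com", "timestamp": "2020-01-01T09:30:00",
     "ip_address": "192.0.2.1", "fraud_label": "0"},
    {"email": "b@example.com", "timestamp": "not-a-date",
     "ip_address": "", "fraud_label": "1"},
]
errors, warnings = check_training_data(sample)
print(errors)
print(warnings)
```

In practice the rows could come from `csv.DictReader` over the file you intend to upload to S3.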

As a “Preview” service, Fraud Detector feels well tested. I had a few issues with data entries not saving correctly first time and had to refresh the screen and start again. Most of my gripes are minor usability quirks which I am sure AWS will iron out before the GA release. I focused on using the console but did briefly check out the API Reference, and it seems comprehensive and performant. My minor quibble with the API was version management: there didn’t seem to be a method to list detector and rule versions, and getting the model versions required a specific API call.

I strongly recommend that customers with online registrations who have experienced fraudulent activity consider adding this service. The main criterion is having enough data from an established and growing user base. The service can quickly offer another layer of security in the fight against cyber-crime, one which can evolve and grow with your business. If you would like some help with this, let us know.