At A Glance: This article will give you an overview of APK tracking, including its methods and challenges.

Overview

APK tracking is vital for understanding how users discover and install your application. This article outlines two primary methods of APK tracking, the Probabilistic Method and the APK Referrer Method, along with their associated challenges.

APK tracking involves attributing app installs to specific marketing channels or sources. This process helps developers and marketers evaluate the effectiveness of their campaigns and optimize user acquisition strategies.

Methods of APK Tracking

APK tracking methods help attribute app installs to specific marketing campaigns, especially when apps are distributed as APK files outside of traditional app stores like Google Play. Here’s a detailed explanation of two APK tracking methods: the Probabilistic Method and the APK Referrer Method. If you need help implementing APK tracking, you can follow the steps mentioned here.

1. Probabilistic Method

This method is often used when deterministic identifiers (like device IDs) are unavailable or restricted due to privacy regulations. It involves statistical modelling based on device and session-level information to attribute installs and in-app actions.

How it works

When a user clicks on an ad or marketing link, several anonymized data points are captured, including:

IP Address: Provides geographic location.
Device Type: Brand, model, and OS version.
Browser/Device Fingerprint: Data about the user’s browser, screen resolution, and other attributes that help distinguish users.
Timestamp: The time the ad click occurred.

Matching Algorithm: When the app is installed and opened, the same set of data points is collected. The tracking system then uses probabilistic algorithms to match the click data with the install data based on how closely these attributes align (e.g., same IP, similar time frame, same device model).
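
The matching step can be illustrated with a minimal Python sketch. It is not the algorithm any particular tracking platform uses: the Event fields, the weights, the 24-hour window, and the 0.6 threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """A click or an install, reduced to a few of the signals listed above."""
    campaign: str        # set for clicks; empty string for installs
    ip: str
    device_model: str
    os_version: str
    timestamp: float     # seconds since epoch


# Hypothetical weights and limits, for illustration only.
WEIGHTS = {"ip": 0.5, "device_model": 0.3, "os_version": 0.2}
MAX_WINDOW_SECONDS = 24 * 3600
MIN_SCORE = 0.6


def match_score(click: Event, install: Event) -> float:
    """Return a similarity score between 0 and 1 (0 if outside the time window)."""
    elapsed = install.timestamp - click.timestamp
    if elapsed < 0 or elapsed > MAX_WINDOW_SECONDS:
        return 0.0
    score = 0.0
    if click.ip == install.ip:
        score += WEIGHTS["ip"]
    if click.device_model == install.device_model:
        score += WEIGHTS["device_model"]
    if click.os_version == install.os_version:
        score += WEIGHTS["os_version"]
    return score


def attribute(install: Event, clicks: List[Event]) -> Optional[str]:
    """Return the campaign of the best-scoring click, or None (organic)."""
    best = max(clicks, key=lambda c: match_score(c, install), default=None)
    if best is None or match_score(best, install) < MIN_SCORE:
        return None
    return best.campaign
```

In this sketch, an install that shares its IP and device model with a recent click scores 0.8 and is credited to that click's campaign, while an install with no sufficiently similar click is treated as organic.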

Attribution Rate

The system assigns the install to the campaign that generated the click with the highest likelihood, based on statistical matching. There’s no guarantee of 100% accuracy, but it provides an estimate based on available data.

Correct Attribution Rate: 85-90%
Example: At an 85-90% correct attribution rate, roughly 1 to 2 out of every 10 non-organic installs driven by marketing partners may be incorrectly categorized, for instance as organic installs.
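
As a rough worked example of what that rate implies, the Python sketch below computes the expected number of misattributed installs. The function name is hypothetical, and the calculation assumes every install is misattributed independently at the same rate.

```python
def expected_misattributed(installs: int, correct_rate: float) -> float:
    """Expected number of installs credited to the wrong source."""
    if not 0.0 <= correct_rate <= 1.0:
        raise ValueError("correct_rate must be between 0 and 1")
    return installs * (1.0 - correct_rate)


# The 85-90% range quoted above, applied to 10 non-organic installs.
low = expected_misattributed(10, 0.90)
high = expected_misattributed(10, 0.85)
print(f"Expected misattributed installs out of 10: {low:.1f} to {high:.1f}")
```

For 10 installs this works out to about 1.0 to 1.5 misattributed installs on average.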

2. APK Referrer Method

This method is more deterministic and passes campaign information through an install referrer mechanism, such as Android’s Install Referrer API or the equivalent feature supported by many third-party app stores.

How it works

Install Referrer Tracking: When an APK is downloaded and installed, referrer data (such as utm_campaign, utm_source, and other campaign information) is passed along with the APK. This referrer information is often embedded in the download URL or the marketing link.
Referrer Read During Installation: When the user installs and opens the APK, the app collects the referrer information, which can be retrieved through Android's Install Referrer API. (The older INSTALL_REFERRER broadcast intent served the same purpose but has since been deprecated by Google Play.) This provides the app with details about the marketing campaign or the source of the install.
Deterministic Attribution: Since this method relies on the actual referral link data that’s tied to the APK installation, it provides deterministic attribution, meaning the exact campaign that led to the install can be identified with certainty.
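
Once the app has read the referrer string, turning it into campaign fields is a simple parsing step. The Python sketch below shows one way this could look on the server side. It assumes the referrer is a query string of utm_ parameters that may arrive URL-encoded; the function names and the returned dictionary shape are hypothetical, and reading the referrer on the device itself is not shown.

```python
from typing import Optional
from urllib.parse import parse_qs, unquote


def parse_referrer(raw_referrer: str) -> dict:
    """Split a referrer string such as 'utm_source=x&utm_campaign=y' into fields."""
    decoded = unquote(raw_referrer)      # handles a referrer that was URL-encoded as a whole
    params = parse_qs(decoded)           # maps each key to a list of values
    return {key: values[0] for key, values in params.items()}


def attribute_from_referrer(raw_referrer: Optional[str]) -> dict:
    """Classify an install as organic or non-organic from its referrer."""
    fields = parse_referrer(raw_referrer or "")
    source = fields.get("utm_source")
    if not source:
        # No usable referrer: nothing ties this install to a campaign.
        return {"type": "organic"}
    return {
        "type": "non-organic",
        "source": source,
        "campaign": fields.get("utm_campaign"),
    }
```

An install with an empty or missing referrer falls through to organic here, which is the same data gap that the challenges discussed later in this article describe.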

Reasons for Higher Non-Organic Attribution

Absence of Click Referrer: Many clients store their apps on their own servers, leading to a lack of direct user click referrer data from app stores.
Reliance on Models: Attribution relies heavily on statistical models rather than direct user interaction data.

Challenges in APK Tracking

Despite these tracking methods, several challenges arise:

1. Attribution Challenges

Overlapping Categories: Some installs may be classified as both organic and non-organic, complicating the attribution process.
Data Gaps: Missing or incomplete referrer data makes it difficult to accurately track the source of installs.

2. Model Limitations

Reliability of Models: The accuracy of probabilistic models can vary, leading to misattribution and inefficiencies.
Changing User Behavior: Shifts in user behaviour, market trends, and privacy updates (like Apple’s ATT framework) can affect the performance of tracking models over time.

Deterministic vs. Probabilistic Tracking Models

Probabilistic and deterministic tracking models are two approaches used in mobile attribution to link a user's ad interaction to a later install or in-app action. Here’s a breakdown of the key differences:

Deterministic Tracking Models

Deterministic tracking relies on unique identifiers (like device IDs or user login information) to directly match user actions across different platforms. When a user interacts with an ad and later performs an action (e.g., installs an app), deterministic tracking can precisely identify that the user is the same based on these identifiers.

Accuracy: It's highly accurate since it's based on exact user data, ensuring minimal ambiguity.
Common identifiers: Device ID, User ID, Ad ID (like Apple's IDFA or Google’s GAID), or login information.
Use cases: Ideal for scenarios where unique identifiers are available, such as in-app advertising or when users are logged in.
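
Deterministic matching amounts to an exact lookup on a shared identifier. The Python sketch below illustrates this with an advertising ID. The dictionary keys (ad_id, campaign, timestamp) are hypothetical, and crediting the most recent prior click ("last click") is an assumption made for the example, not a rule stated in this article.

```python
from typing import List, Optional


def attribute_deterministic(install: dict, clicks: List[dict]) -> Optional[str]:
    """Return the campaign of the latest click that shares the install's ad ID."""
    ad_id = install.get("ad_id")
    if not ad_id:
        # No identifier available: deterministic matching cannot be applied.
        return None
    candidates = [
        click for click in clicks
        if click["ad_id"] == ad_id and click["timestamp"] <= install["timestamp"]
    ]
    if not candidates:
        return None
    # Assumption: the most recent matching click gets the credit.
    return max(candidates, key=lambda click: click["timestamp"])["campaign"]
```

Because the match is an exact comparison of identifiers, the result is either a definite campaign or no match at all; there is no score or threshold involved.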

Probabilistic Tracking

Probabilistic tracking relies on statistical models and algorithms to attribute user actions based on less precise data. It uses signals that do not uniquely identify a user, such as device type, IP address, location, or timestamps, to estimate the likelihood that a particular action belongs to a specific user.

Accuracy: It’s less accurate than deterministic tracking but can still be quite reliable. However, it may lead to false positives or negatives since it is based on probabilities rather than exact matches.
Use cases: Useful when deterministic identifiers are not available, such as in environments where privacy regulations restrict access to device IDs (like Apple's ATT framework).

Key Differences

Factor	Deterministic Tracking	Probabilistic Tracking
Data Use	Uses unique identifiers	Uses anonymized data points
Accuracy	Highly precise	Estimated attribution
Privacy	Requires user consent	More privacy-friendly

Both models are often used together to create a comprehensive attribution strategy, with deterministic tracking preferred when identifiers are available and probabilistic tracking used as a fallback.
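
The fallback ordering described above can be sketched in Python as a chain of matchers tried in sequence. Everything here is hypothetical: the matcher signature, the method labels, and the result shape are illustrative assumptions, and the two matchers in the usage example are simple stand-ins for real deterministic and probabilistic logic.

```python
from typing import Callable, Optional, Sequence, Tuple

# A matcher takes an install record and returns a campaign name, or None if it cannot match.
Matcher = Callable[[dict], Optional[str]]


def attribute_with_fallback(install: dict, matchers: Sequence[Tuple[str, Matcher]]) -> dict:
    """Try each matcher in order and report which method produced the result."""
    for method, matcher in matchers:
        campaign = matcher(install)
        if campaign is not None:
            return {"campaign": campaign, "method": method}
    # No matcher found a campaign, so the install is treated as organic.
    return {"campaign": None, "method": "organic"}


# Stand-in matchers backed by tiny lookup tables, for demonstration only.
def by_ad_id(install: dict) -> Optional[str]:
    return {"gaid-123": "summer_sale"}.get(install.get("ad_id"))


def by_fingerprint(install: dict) -> Optional[str]:
    key = (install.get("ip"), install.get("device_model"))
    return {("203.0.113.7", "Pixel 7"): "retargeting"}.get(key)


CHAIN = [("deterministic", by_ad_id), ("probabilistic", by_fingerprint)]
result = attribute_with_fallback({"ip": "203.0.113.7", "device_model": "Pixel 7"}, CHAIN)
```

Recording the method alongside the campaign keeps deterministic and probabilistic attributions distinguishable in reporting, which can help when evaluating how much of the data rests on estimates.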

Recommendations

Regularly Update Models: Continuously refine attribution models to improve accuracy.
Monitor User Behavior: Keep track of changes in user behaviour to adapt tracking strategies effectively.
Integrate Data Sources: Where possible, integrate additional data sources to enhance attribution accuracy.

We are delighted to have assembled a world-class team of experienced professionals who are ready to take care of your queries and answer any questions you may have.
Feel free to reach out to us at any time by emailing us at support@apptrove.com or by using the in-platform chat feature. We'd love to hear from you!