FactSet Ownership - Aggregated Insider Transactions Overview


FactSet Ownership - Aggregated Insider Transactions is a dataset that summarizes insider transactions of publicly traded companies. Insider transactions are trades by corporate insiders that are reported to the SEC.

In this dataset, insider transactions for each publicly traded US company are summarized into daily fields that you can use in your algorithms.

This notebook serves as an introduction to using the data in the Research environment and in algorithms. See here for the full documentation.


Let's start by just pulling in some data. We import the Pipeline helpers and the two insider transactions DataSetFamilys, Form3AggregatedTrades and Form4and5AggregatedTrades:

In [1]:
from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
from quantopian.pipeline.domain import US_EQUITIES

from quantopian.pipeline.data.factset.ownership import Form3AggregatedTrades
from quantopian.pipeline.data.factset.ownership import Form4and5AggregatedTrades

Insider transactions data is split up into two DataSetFamilys:

  • Form3AggregatedTrades contains Form 3 trades. Form 3 is filed by insiders when they are declaring an initial holding.
  • Form4and5AggregatedTrades contains Form 4 and 5 trades. Forms 4 and 5 are filed by insiders when they are declaring a change in ownership, like buying new shares or selling some of their shares.

For more details on the differences between each form, see this page from the SEC.

In this example, let's pull in data from both forms. To do this, we use the slice method, which turns a DataSetFamily into a DataSet. Two arguments must be passed each time you slice insider transactions:

  • The first argument is derivative_holdings. This selects for non-derivative or derivative holdings. Non-derivative shares are direct shares of the company's own stock, and derivative shares are a derivative of the company's stock (like a restricted stock unit or an option involving the company). derivative_holdings can be False for non-derivative shares or True for derivative shares. Let's use False in this case, because we only care about direct ownership.
  • The second argument is days_aggregated. This selects the trailing number of calendar days over which data is aggregated. If days_aggregated is 7, for example, then all transactions from the trailing 7 calendar days are taken into account. This can be 1, 7, 30, or 90. Let's set 90 here so we aggregate transactions from the trailing 90 calendar days.
In [2]:
insider_txns_form3_90d = Form3AggregatedTrades.slice(derivative_holdings=False, days_aggregated=90)
insider_txns_form4and5_90d = Form4and5AggregatedTrades.slice(derivative_holdings=False, days_aggregated=90)

Now that we have our DataSets defined, all we have to do is run a Pipeline to pull out data.

In Form3AggregatedTrades, the column available is num_unique_filers, which describes the number of unique insiders that filed a new trade within our aggregation window (days_aggregated).

In Form4and5AggregatedTrades, the columns available are num_unique_buyers and num_unique_sellers, which respectively describe the number of unique insiders that have executed a buy or sell within our aggregation window.
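
To make the aggregation window concrete, here is a minimal pandas sketch of how a count such as num_unique_buyers could be derived from raw filings. The toy filings table, the unique_insiders helper, and the exact window boundaries (the days_aggregated calendar days strictly before the as-of date) are assumptions for illustration only; this notebook does not show how FactSet builds these fields.

```python
import pandas as pd

# Hypothetical raw filings for one company (not the real dataset layout).
filings = pd.DataFrame({
    'date': pd.to_datetime(['2016-01-04', '2016-01-05', '2016-01-05',
                            '2016-01-20', '2016-03-01']),
    'insider': ['alice', 'bob', 'alice', 'carol', 'bob'],
    'side': ['buy', 'buy', 'sell', 'buy', 'sell'],
})

def unique_insiders(filings, as_of, days_aggregated, side):
    """Count distinct insiders on one side within the trailing calendar-day window.

    Assumed window: [as_of - days_aggregated, as_of), i.e. as_of itself is excluded.
    """
    as_of = pd.Timestamp(as_of)
    start = as_of - pd.Timedelta(days=days_aggregated)
    in_window = (filings['date'] >= start) & (filings['date'] < as_of)
    on_side = filings['side'] == side
    return filings.loc[in_window & on_side, 'insider'].nunique()

buyers_7d = unique_insiders(filings, '2016-01-06', 7, 'buy')
sellers_7d = unique_insiders(filings, '2016-01-06', 7, 'sell')
buyers_90d = unique_insiders(filings, '2016-03-02', 90, 'buy')
print(buyers_7d, sellers_7d, buyers_90d)
```

Note that an insider who both buys and sells inside the window (alice here) is counted once as a buyer and once as a seller.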

In [3]:
pipe = Pipeline(
    columns={
        'unique_filers_form3_90d': insider_txns_form3_90d.num_unique_filers.latest,
        'unique_buyers_form4and5_90d': insider_txns_form4and5_90d.num_unique_buyers.latest,
        'unique_sellers_form4and5_90d': insider_txns_form4and5_90d.num_unique_sellers.latest,
    },
    domain=US_EQUITIES,
)

# Run the pipeline over a two-year period
df = run_pipeline(pipe, '2015-10-01', '2017-10-01')
df.head()

Pipeline Execution Time: 9.03 Seconds

                                                unique_buyers_form4and5_90d  unique_filers_form3_90d  unique_sellers_form4and5_90d
2015-10-01 00:00:00+00:00 Equity(2 [ARNC])                              0.0                      0.0                           0.0
                          Equity(21 [AAME])                             1.0                      0.0                           0.0
                          Equity(24 [AAPL])                             7.0                      0.0                           7.0
                          Equity(25 [ARNC_PR])                          NaN                      NaN                           NaN
                          Equity(31 [ABAX])                             1.0                      0.0                           1.0

We have now loaded insider transactions data into a DataFrame that we can use to research trading signals and inform trades.

Let's extend this a little bit. Insider transactions data can be used creatively in a bunch of ways; one potential use could be in signaling an upcoming acquisition or corporate action. Insiders who know that there is an upcoming corporate action might go quiet to avoid accusations of insider trading, so we may be able to use a decrease in insider transactions to point to stocks that are on the cusp of announcing a big acquisition, spinoff, etc.

As a simple example, let's take the case of Microsoft acquiring LinkedIn in 2016 for $26 billion, one of the largest acquisitions of the decade. Did Microsoft insiders slow down their trading prior to the acquisition announcement?

In [4]:
import matplotlib.pyplot as plt
import pandas as pd

# Create a DataFrame just containing Microsoft activity
msft_insider_txns = df[df.index.get_level_values(1) == symbols('MSFT')].reset_index(1, drop=True)

# Plot the count of unique insiders buying or selling on any form
msft_insider_txns.sum(axis=1).plot(title='Count of Microsoft insiders trading before LinkedIn acquisition')

plt.ylabel('Count of Microsoft insiders trading')

# Compare this to the LinkedIn acquisition on June 13th, 2016
linkedin_acquisition_announcement_date = pd.Timestamp('2016-06-13')
plt.axvline(x=linkedin_acquisition_announcement_date, color='red')
plt.text(linkedin_acquisition_announcement_date + pd.Timedelta(days=5), 29,
         'Microsoft announces LinkedIn acquisition', color='red');

In this case, it looks like executives did slow down their trading prior to the acquisition announcement. This is just a simple example and we won't always see results like this, but something like this could be extended into a trading signal.
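
One way to put a number on "slowing down" is to compare average insider activity in a window just before the event with the average over earlier history. The sketch below does this on synthetic data; the pre_event_activity_ratio helper, the 30-day window, and the counts themselves are all assumptions for illustration, not Microsoft's actual figures and not a tested signal.

```python
import pandas as pd

def pre_event_activity_ratio(counts, event_date, window_days=30):
    """Mean count in the window_days before event_date, divided by the mean
    count over all earlier history. Values below 1 suggest a slowdown."""
    event_date = pd.Timestamp(event_date)
    window_start = event_date - pd.Timedelta(days=window_days)
    baseline = counts[counts.index < window_start]
    window = counts[(counts.index >= window_start) & (counts.index < event_date)]
    return window.mean() / baseline.mean()

# Synthetic daily insider counts: steady at 10, dropping to 4 a month before the event.
dates = pd.date_range('2016-01-01', '2016-06-12', freq='D')
counts = pd.Series(10.0, index=dates)
counts.loc[counts.index >= pd.Timestamp('2016-05-14')] = 4.0

ratio = pre_event_activity_ratio(counts, '2016-06-13', window_days=30)
print(ratio)
```

With real Pipeline output, counts would be a per-company series such as msft_insider_txns.sum(axis=1).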

Usage with a CustomFactor

We can also use Aggregated Insider Transactions data in a Pipeline CustomFactor. Let's explore an example that uses a CustomFactor to measure the week-over-week differences in Form 3 filers among all companies.

Let's first define our CustomFactor. We use a window_length of 5 here, which tells our WeekOverWeekDifference factor to take in values for the past 5 market days. We output the latest value minus the oldest value in that window. This gives us the change across the past 5 market days, or roughly the past week (unless there are market holidays).

You can read more about CustomFactors here.

In [5]:
from quantopian.pipeline import CustomFactor

class WeekOverWeekDifference(CustomFactor):
    window_length = 5
    def compute(self, today, assets, out, value):
        out[:] = value[-1] - value[0]
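
To see what compute does with its input, here is a small NumPy sketch outside of Pipeline. It assumes the input arrives as a 2D array with one row per day (oldest first) and one column per asset; the numbers are made up for illustration.

```python
import numpy as np

# Made-up num_unique_filers values: 5 rows (days, oldest first) x 4 columns (assets).
value = np.array([
    [3.0, 0.0, 1.0, 2.0],
    [3.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 1.0, 2.0],
    [2.0, 1.0, 1.0, 2.0],
    [1.0, 2.0, 1.0, np.nan],
])

# Same arithmetic as WeekOverWeekDifference.compute: newest row minus oldest row.
out = np.empty(value.shape[1])
out[:] = value[-1] - value[0]
print(out)
```

Only the first and last rows matter; an asset with a missing value in either row comes out as NaN, and an asset whose count did not change comes out as 0.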

Next, let's use our CustomFactor in a Pipeline. Let's load in the Form 3 insider transactions data, and use derivative holdings for this example. We'll set days_aggregated to 7 to give us the unique filers over the past 7 calendar days. (Note: this is different from the 5 market days we used in our CustomFactor above; see the section below for details on this difference.)

We pass the num_unique_filers field to WeekOverWeekDifference, which will place the difference in our Pipeline output.

In [6]:
form3_derivative_7d = Form3AggregatedTrades.slice(derivative_holdings=True, days_aggregated=7)

form3_filers_week_over_week_diff = WeekOverWeekDifference(inputs=[form3_derivative_7d.num_unique_filers])

pipe = Pipeline(
    columns={
        'form3_filers_week_over_week_diff': form3_filers_week_over_week_diff,
    },
    screen=(form3_filers_week_over_week_diff.notnull() & (form3_filers_week_over_week_diff != 0)),
    domain=US_EQUITIES,
)

# Run the pipeline over five months
df_week_over_week = run_pipeline(pipe, '2017-01-01', '2017-06-01')
df_week_over_week.head()

Pipeline Execution Time: 0.09 Seconds

                                                form3_filers_week_over_week_diff
2017-01-03 00:00:00+00:00 Equity(2696 [FAST])                               -2.0
                          Equity(5040 [MRVC])                               -1.0
                          Equity(7779 [UMH])                                 1.0
                          Equity(16389 [NCR])                               -2.0
                          Equity(39526 [MITL])                             -16.0

Every trading day, Pipeline uses Aggregated Insider Transactions to count the unique Form 3 filers for each stock aggregated over the past 7 calendar days. It then looks back 5 trading days (which is typically one week) from the current date, and counts the unique Form 3 filers for each stock aggregated over the past 7 calendar days from that date. Using our CustomFactor, Pipeline then takes the difference of these two aggregations, which gives us the week-over-week change in the number of filers.
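
The two steps can be sketched together in plain pandas: first a trailing 7-calendar-day unique-filer count evaluated on each weekday, then the newest-minus-oldest difference over a 5-row window, mirroring the CustomFactor's arithmetic. The filings table, the filers_trailing_7d helper, and the window boundaries are assumptions for illustration, and pd.bdate_range is only a stand-in for the real trading calendar (it ignores market holidays).

```python
import pandas as pd

# Hypothetical Form 3 filings for one company: (calendar date, insider id).
filings = pd.DataFrame({
    'date': pd.to_datetime(['2017-01-03', '2017-01-04', '2017-01-04',
                            '2017-01-10', '2017-01-11']),
    'insider': ['a', 'b', 'a', 'c', 'c'],
})

def filers_trailing_7d(as_of):
    """Distinct filers in the 7 calendar days strictly before as_of (assumed window)."""
    start = as_of - pd.Timedelta(days=7)
    mask = (filings['date'] >= start) & (filings['date'] < as_of)
    return filings.loc[mask, 'insider'].nunique()

# Step 1: the 7-calendar-day aggregate, evaluated once per weekday.
trading_days = pd.bdate_range('2017-01-03', '2017-01-13')
agg_7d = pd.Series([filers_trailing_7d(d) for d in trading_days],
                   index=trading_days, dtype=float)

# Step 2: newest minus oldest value in each 5-row window.
diff = agg_7d.rolling(5).apply(lambda w: w[-1] - w[0], raw=True)
print(pd.DataFrame({'agg_7d': agg_7d, 'diff': diff}))
```

The first four rows of diff are NaN because a full 5-row window is not yet available.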

Note how, in our output, we have filtered for only non-zero and non-null differences using the screen argument.

Calendar days vs. trading days

The days_aggregated field we slice on is measured in calendar days, while most other Quantopian datasets consider trading days. Calendar days include weekends and holidays, while trading days only include days the market is open.

To highlight the difference in this dataset, consider a case where we are looking back from a Monday with days_aggregated set to 1:

In [7]:
# An example slice with days_aggregated set to 1.
insider_txns_form3_1d = Form3AggregatedTrades.slice(False, 1)

pipe = Pipeline(
    columns={
        'filers_form3_1d': insider_txns_form3_1d.num_unique_filers.latest,
    },
    domain=US_EQUITIES,
)

# Run the Pipeline on a random Monday
df = run_pipeline(pipe, '2016-01-04', '2016-01-04')

# Preview the first rows to check for any nonzero entries
df.head()

Pipeline Execution Time: 0.05 Seconds

                                             filers_form3_1d
2016-01-04 00:00:00+00:00 Equity(2 [ARNC])               0.0
                          Equity(21 [AAME])              0.0
                          Equity(24 [AAPL])              0.0
                          Equity(31 [ABAX])              0.0
                          Equity(41 [ARCB])              0.0

We are running our Pipeline on a single day here, which is a Monday. Since days_aggregated is set to 1 and defines a lookback in calendar days, Pipeline will only consider insider transactions from the previous calendar day, which is Sunday in this case. Data from the previous trading day is ignored, as it is not the previous calendar day. (Normally that would be Friday; here it is Thursday, December 31, 2015, because Friday, January 1 was a market holiday.) We can see that the result does not contain any nonzero entries, which makes sense given that it's rare for a transaction to take place on a Sunday and get filed and immediately included in the dataset.

If we run this Pipeline on the next day (Tuesday), we would see insider transactions from Monday, as that is the previous calendar day.
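
The gap between the two conventions is easy to see with plain pandas dates. The sketch below uses a different Monday, 2016-01-11, chosen so that the preceding Friday is an ordinary weekday. The trailing_calendar_days helper is hypothetical, and pd.offsets.BDay only skips weekends, so it is a rough stand-in for a real trading calendar.

```python
import pandas as pd

monday = pd.Timestamp('2016-01-11')

# A 1-calendar-day lookback lands on Sunday; a 1-weekday lookback lands on Friday.
previous_calendar_day = monday - pd.Timedelta(days=1)
previous_weekday = monday - pd.offsets.BDay(1)
print(previous_calendar_day.day_name(), previous_weekday.day_name())

def trailing_calendar_days(as_of, days_aggregated):
    """Calendar dates in the assumed lookback window, excluding as_of itself."""
    return pd.date_range(end=as_of - pd.Timedelta(days=1),
                         periods=days_aggregated, freq='D')

window_1d = trailing_calendar_days(monday, 1)
window_7d = trailing_calendar_days(monday, 7)

# How many weekdays does each window actually cover?
weekdays_in_1d = [d for d in window_1d if d.dayofweek < 5]
weekdays_in_7d = [d for d in window_7d if d.dayofweek < 5]
print(len(weekdays_in_1d), len(weekdays_in_7d))
```

A 7-calendar-day window always spans a full week of weekdays, which is why days_aggregated=7 pairs naturally with a 5-market-day factor, while a 1-day window looked at on a Monday covers no weekdays at all.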
