
# Investing in Women-Led Companies

It has been widely reported that companies with women in senior management and on the board of directors perform better than companies without. Credit Suisse’s Gender 3000 report looks at gender diversity in 3000 companies across 40 countries. According to this report, at the end of 2013, women accounted for 12.9% of top management (CEOs and directors reporting to the CEO) and 12.7% of boards had gender diversity. Additionally, “Companies with more than one woman on the board have returned a compound 3.7% a year over those that have none since 2005.”

These kinds of reports quickly lead to the question, “What would happen if you invested in companies with female CEOs?”

## The Data

The data backing this research was provided by Catalyst’s (http://www.catalyst.org/) Bottom Line Research Project (http://www.catalyst.org/knowledge/bottom-line-0).

In [24]:
#Import the libraries needed for the analysis.

import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import matplotlib.pyplot as pyplot
import pytz
from pytz import timezone
from zipline.api import (order_target_percent, record, symbol, history, add_history, get_datetime,
get_open_orders, get_order, order_target_value, order, order_target, sid)
from zipline.finance.slippage import FixedSlippage

#Import my csv and rename some of the columns
CEOs = local_csv('FemaleCEOs_v6.csv')
CEOs.rename(columns={'SID':'Ticker', 'Start Date':'start_date', 'End Date':'end_date'}, inplace=True)

#Below you see some basic information and the first 20 rows of this dataframe.
print "Number of CEOs = %s" % len(CEOs)
print "Number of Companies = %s" % CEOs['Ticker'].nunique()
CEOs[0:20]

Number of CEOs = 80
Number of Companies = 74

Out[24]:
CEO Company Name Ticker start_date end_date
0 Patricia A Woertz Archer Daniels Midland Company (ADM) ADM 5/1/06 12/31/14
1 Patricia Russo Lucent ALU 12/1/06 8/1/08
2 Katherine Krill AnnTaylor Stores Corporation ANN 10/1/05 12/31/14
3 Angela Braly WellPoint ANTM 6/1/07 8/1/12
4 Judy McReynolds Arkansas Best Corp. ARCB 1/2/10 12/31/14
5 Andrea Jung Avon Product AVP 11/4/99 4/22/12
6 Sheri McCoy Avon Product AVP 4/23/12 12/31/14
7 Susan N. Story American Water Works Company AWK 5/9/14 12/31/14
8 Gayla Delly Benchmark Electronics BHE 1/3/12 12/31/14
9 Elizabeth Smith Bloomin' Brands BLMN 8/9/12 12/31/14
10 Diane M. Sullivan Brown Shoe Company CAL 5/26/11 12/31/14
11 Sandra Cochran Cracker Barrel Old Country Store CBRL 9/12/11 12/31/14
12 Linda Massman Clearwater Paper CLW 1/2/13 12/31/14
13 Denise Morrison Campbell Soup CPB 8/1/11 12/31/14
14 Andrea Ayers Convergys CVG 10/2/12 12/31/14
15 Ellen Kullman DuPont DD 1/2/09 12/31/14
16 Sara Mathew Dun & Bradstreet Inc. DNB 1/2/10 10/7/13
17 Lynn Good Duke Energy DUK 7/1/13 12/31/14
18 Margaret Whitman eBay EBAY 1/1/98 3/1/08
19 Mary Agnes Wilderotter Citizens Communications FTR 11/1/04 12/31/14

## How many new CEOs per year?

In [25]:
CEOs['year_started'] = pd.DatetimeIndex(CEOs['start_date']).year
CEOs['year_ended'] = pd.DatetimeIndex(CEOs['end_date']).year

CEOs['year_started'].value_counts(sort=False).plot(kind='bar')

pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)


## Scrubbing the Data

In [26]:
# First I need to convert the date values in the csv to datetime objects in UTC timezone.

CEOs['start_date'] = CEOs['start_date'].apply(lambda row: pd.to_datetime(str(row), utc=True))
CEOs['end_date'] = CEOs['end_date'].apply(lambda row: pd.to_datetime(str(row), utc=True))

In [27]:
# Then I want to check if any of the dates are weekends.
# If they are a weekend, I move them to the following Monday.

def check_date(row):
    week_day = row.isoweekday()
    if week_day == 6:
        row = row + timedelta(days=2)
    elif week_day == 7:
        row = row + timedelta(days=1)
    return row

CEOs['start_date'] = CEOs['start_date'].apply(check_date)
CEOs['end_date'] = CEOs['end_date'].apply(check_date)
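
As a rough illustration of the weekend rule above, here is a minimal standalone sketch that uses only the standard library instead of pandas. The function name `roll_weekend_forward` is hypothetical and not part of the notebook. The sample dates are taken from the CEO table: 10/1/05 (a Saturday), 4/22/12 (a Sunday), and 5/9/14 (a Friday).

```python
from datetime import date, timedelta


def roll_weekend_forward(day):
    """Return the following Monday if `day` is a Saturday or Sunday, else `day`.

    Hypothetical helper for illustration only; it mirrors check_date but
    works on plain datetime.date objects.
    """
    weekday = day.isoweekday()  # Monday == 1 ... Sunday == 7
    if weekday == 6:
        return day + timedelta(days=2)
    if weekday == 7:
        return day + timedelta(days=1)
    return day


print(roll_weekend_forward(date(2005, 10, 1)))  # 2005-10-03
print(roll_weekend_forward(date(2012, 4, 22)))  # 2012-04-23
print(roll_weekend_forward(date(2014, 5, 9)))   # 2014-05-09
```

Note that this rule only handles weekends. It does not know about market holidays, so a date that rolls onto a holiday Monday would still need separate handling.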

In [28]:
# We need to deal with the dates that are outside of our pricing data range.
# For people who started prior to 01/02/2002, I have changed their start date to 01/02/2002.
# I also changed any future-dated end dates to 12/1/2014, just to be safe.

def change_date(row):
    start_date = row['start_date']
    end_date = row['end_date']

    # Check each bound independently so a row can have both dates adjusted.
    if start_date < pd.to_datetime("2002-01-02", utc=True):
        row['start_date'] = pd.to_datetime("2002-01-02", utc=True)
    if end_date > pd.to_datetime("2015-01-01", utc=True):
        row['end_date'] = pd.to_datetime("2014-12-01", utc=True)
    return row

CEOs = CEOs.apply(change_date, axis=1)
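
The clamping rule can also be sketched without pandas. This is only an illustration under stated assumptions: the function name `clamp_tenure` and the constant names are hypothetical, and the cutoff dates are the ones used in the cell above. Each bound is checked independently, so a tenure that both starts too early and ends too late has both dates adjusted.

```python
from datetime import date

# Cutoffs copied from the notebook cell; the names are hypothetical.
PRICING_START = date(2002, 1, 2)
END_CUTOFF = date(2015, 1, 1)
END_REPLACEMENT = date(2014, 12, 1)


def clamp_tenure(start, end):
    """Pull a (start, end) tenure back inside the pricing data range."""
    if start < PRICING_START:
        start = PRICING_START
    if end > END_CUTOFF:
        end = END_REPLACEMENT
    return start, end


# A tenure that began before the pricing data starts (dates as in the eBay row).
print(clamp_tenure(date(1998, 1, 1), date(2008, 3, 3)))
# A made-up tenure that is out of range on both sides.
print(clamp_tenure(date(1999, 1, 1), date(2016, 6, 1)))
```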

In [29]:
# I then add a new column called SID, which is the Security Identifier.
# Since ticker symbols are not unique across all time, the SID ensures we have the right company.
# I use the ticker and the start date to search for the security object.

def get_SID(row):
    temp_ticker = row['Ticker']
    start_date = row['start_date'].tz_localize('UTC')

    row['SID'] = symbols(temp_ticker, start_date)
    return row

CEOs = CEOs.apply(get_SID, axis=1)
CEOs.sort(columns='start_date')

Out[29]:
CEO Company Name Ticker start_date end_date year_started year_ended SID
21 Paula G. Rosput AGL Resources Inc. GAS 2002-01-02 2006-01-03 2000 2006 Equity(595 [GAS])
76 Carleton Fiorina HP HP 2002-01-02 2005-02-08 1999 2005 Equity(3647 [HP])
18 Margaret Whitman eBay EBAY 2002-01-02 2008-03-03 1998 2008 Equity(24819 [EBAY])
5 Andrea Jung Avon Product AVP 2002-01-02 2012-04-23 1999 2012 Equity(660 [AVP])
69 Anne Mulcahy Xerox Corp XRX 2002-01-02 2009-07-01 2001 2009 Equity(8354 [XRX])
65 Debra Cafaro Ventas VTR 2002-01-02 2014-12-31 1999 2014 Equity(18821 [VTR])
58 Cinda Hallman Spherion SFN 2002-01-02 2004-04-01 2001 2004 Equity(21809 [SFN])
24 S. Marce Fuller Genon Energy GEN 2002-01-02 2005-09-30 1999 2005 Equity(22720 [GEN])
49 Pamela Kirby Quintiles Transnational Q 2002-01-02 2003-09-25 2001 2003 Equity(17104 [Q])
79 Dorrit J Bern Charming Shoppes CHRS 2002-02-04 2008-07-09 2002 2008 Equity(1524 [CHRS])
74 Mary Forte Zale ZLC 2002-08-01 2006-01-30 2002 2006 Equity(10069 [ZLC])
43 Patricia Gallup PC Connection PCCC 2002-09-03 2011-08-08 2002 2011 Equity(18471 [PCCC])
78 Stephanie Streeter Banta BN 2002-10-01 2007-01-03 2002 2007 Equity(1012 [BN])
47 Dona Davis Young Phoenix Companies PNX 2003-01-02 2009-04-15 2003 2009 Equity(22832 [PNX])
50 Mary Sammons Rite Aid Corp RAD 2003-06-25 2010-06-23 2003 2010 Equity(6330 [RAD])
19 Mary Agnes Wilderotter Citizens Communications FTR 2004-11-01 2014-12-31 2004 2014 Equity(2069 [FTR])
41 Janet L. Robinson The New York Times Company NYT 2004-12-27 2012-01-03 2004 2012 Equity(5551 [NYT])
37 Patricia Kampling Alliant Energy LNT 2005-04-01 2014-12-31 2005 2014 Equity(18584 [LNT])
51 Susan Ivey Reynolds American RAI 2005-06-27 2011-02-01 2005 2011 Equity(20277 [RAI])
2 Katherine Krill AnnTaylor Stores Corporation ANN 2005-10-03 2014-12-31 2005 2014 Equity(430 [ANN])
34 Linda A. Lang Jack in the Box JACK 2005-10-03 2014-01-02 2005 2014 Equity(20740 [JACK])
20 Mary Agnes Wilderotter Frontier Communications FTR 2006-01-03 2014-12-31 2006 2014 Equity(2069 [FTR])
77 Paula Rosput Reynolds Safeco SAF 2006-01-03 2008-09-02 2006 2008 Equity(6622 [SAF])
75 Mary Burton Zale ZLC 2006-01-30 2007-12-03 2006 2007 Equity(10069 [ZLC])
56 Claire Babrowski Radio Shack RSH 2006-02-01 2006-07-05 2006 2006 Equity(21550 [RSH])
48 Peggy Y. Fowler Portland General Electric POR 2006-04-13 2009-03-02 2006 2009 Equity(28318 [POR])
27 Constance H. Lau Hawaiian Electric Industries Inc. HE 2006-05-02 2014-12-31 2006 2014 Equity(3509 [HE])
38 Irene Rosenfeld Mondelez International MDLZ 2006-06-26 2014-12-31 2006 2014 Equity(22802 [MDLZ])
44 Indra Nooyi PepsiCo PEP 2006-10-02 2014-12-31 2006 2014 Equity(5885 [PEP])
... ... ... ... ... ... ... ... ...
53 Mary Berner Reader's Digest Association RDA 2010-11-10 2011-04-25 2010 2011 Equity(40397 [RDA])
35 Beth E. Mooney KeyCorp KEY 2011-05-02 2014-12-31 2011 2014 Equity(4221 [KEY])
10 Diane M. Sullivan Brown Shoe Company CAL 2011-05-26 2014-12-31 2011 2014 Equity(1195 [CAL])
59 Debra Reed Sempra Energy SRE 2011-06-27 2014-12-31 2011 2014 Equity(24778 [SRE])
13 Denise Morrison Campbell Soup CPB 2011-08-01 2014-12-31 2011 2014 Equity(1795 [CPB])
11 Sandra Cochran Cracker Barrel Old Country Store CBRL 2011-09-12 2014-12-31 2011 2014 Equity(1308 [CBRL])
28 Meg Whitman HP HPQ 2011-09-22 2014-12-31 2011 2014 Equity(3735 [HPQ])
33 Denise Ramos ITT ITT 2011-10-03 2014-12-31 2011 2014 Equity(14081 [ITT])
22 Gracia Martore Tegna TGNA 2011-10-06 2014-12-31 2011 2014 Equity(3128 [TGNA])
71 Gretchen McClain Xylem XYL 2011-10-24 2013-09-09 2011 2013 Equity(42023 [XYL])
30 Virgina Rometty IBM IBM 2012-01-03 2014-12-31 2012 2014 Equity(3766 [IBM])
8 Gayla Delly Benchmark Electronics BHE 2012-01-03 2014-12-31 2012 2014 Equity(856 [BHE])
39 Heather Bresch Mylan MYL 2012-01-03 2014-12-31 2012 2014 Equity(5166 [MYL])
6 Sheri McCoy Avon Product AVP 2012-04-23 2014-12-31 2012 2014 Equity(660 [AVP])
73 Marissa Mayer Yahoo YHOO 2012-07-17 2014-12-31 2012 2014 Equity(14848 [YHOO])
9 Elizabeth Smith Bloomin' Brands BLMN 2012-08-09 2014-12-31 2012 2014 Equity(43283 [BLMN])
14 Andrea Ayers Convergys CVG 2012-10-02 2014-12-31 2012 2014 Equity(19203 [CVG])
40 Wellington J. Denahan-Norris Annaly Capital Management NLY 2012-11-05 2014-12-31 2012 2014 Equity(17702 [NLY])
23 Phebe Novakovic General Dynamics GD 2013-01-02 2014-12-31 2013 2014 Equity(3136 [GD])
36 Marillyn Hewson Lockheed Martin LMT 2013-01-02 2014-12-31 2013 2014 Equity(12691 [LMT])
64 Kimberly Bowers CST Brands VLO 2013-01-02 2014-12-31 2013 2014 Equity(7990 [VLO])
12 Linda Massman Clearwater Paper CLW 2013-01-02 2014-12-31 2013 2014 Equity(37775 [CLW])
62 Sheryl Palmer Taylor Morrison Home TMHC 2013-04-22 2014-12-31 2013 2014 Equity(44433 [TMHC])
17 Lynn Good Duke Energy DUK 2013-07-01 2014-12-31 2013 2014 Equity(2351 [DUK])
63 Mary Dillon Ulta Salon Cosmetics & Fragrance ULTA 2013-07-01 2014-12-31 2013 2014 Equity(34953 [ULTA])
26 Lauralee Martin HCP HCP 2013-10-03 2014-12-31 2013 2014 Equity(3490 [HCP])
25 Mary Barra GM GM 2014-01-15 2014-12-31 2014 2014 Equity(40430 [GM])
52 Susan Cameron Reynolds American RAI 2014-05-01 2014-12-31 2014 2014 Equity(20277 [RAI])
7 Susan N. Story American Water Works Company AWK 2014-05-09 2014-12-31 2014 2014 Equity(36098 [AWK])
55 Barbara Rentler Ross Stores ROST 2014-06-02 2014-12-31 2014 2014 Equity(6546 [ROST])

80 rows × 8 columns

In [30]:
# I set the start and end date I want my algo to run for
start_algo = '2002-01-01'
end_algo = '2014-12-31'

# I make a series out of just the SIDs.
SIDs = CEOs.SID

# Then call get_pricing on the series of SIDs and store the results in a new dataframe called data.
data = get_pricing(
    SIDs,
    start_date=start_algo,
    end_date=end_algo,
    fields='close_price',
    handle_missing='ignore'
)


# My Algo - The Concept

The algo, as written below, buys when the CEO comes into the position and sells when she leaves. It rebalances based on the number of stocks in my portfolio. When I own one stock, it will be 100% of my portfolio. When I own two stocks, they will each be 50% of my portfolio. As the number of stocks in my portfolio changes, the target weight of each stock changes too.

A future change would be to reinvest more frequently, for example daily, so that dividends are taken into consideration when they are paid, and not the next time I buy or sell a stock. As a result, the algo is actually underperforming where it should be.
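
To make the weighting rule concrete, here is a small standalone sketch of the equal-weight calculation. The function name `equal_weights` is hypothetical and this is not the code the backtest runs. It only shows how the target weight of each holding follows from the number of holdings.

```python
def equal_weights(holdings):
    """Map each held ticker to an equal share of the portfolio.

    Hypothetical helper for illustration; an empty portfolio has no weights.
    """
    if not holdings:
        return {}
    weight = 1.0 / len(holdings)
    return {ticker: weight for ticker in holdings}


print(equal_weights(['AVP']))         # {'AVP': 1.0}
print(equal_weights(['AVP', 'XRX']))  # each ticker gets 0.5
```

With four holdings each weight would be 0.25, and selling one of them moves the remaining three to one third each.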

## How many companies will I hold each year?

In [31]:
from pandas.tseries.offsets import YearBegin
CEOs['year_ended'] = pd.DatetimeIndex(CEOs['end_date']).year
CEOs['year_started'] = pd.DatetimeIndex(CEOs['start_date']).year

counts = pd.Series(index=pd.date_range('2002-01-01', '2015-01-01', freq=YearBegin(1)))
for year in counts.index:
    counts[year] = len(CEOs[(CEOs.start_date <= year) & (CEOs.end_date >= year)])

counts.plot(kind = 'bar')

pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)


## My Algo

The algo, as written below, buys when the CEO comes into the position and sells when she leaves. It rebalances based on the number of stocks in my portfolio. When I own one stock, it will be 100% of my portfolio. When I own two stocks, they will each be 50% of my portfolio. As the number of stocks in my portfolio changes, the target weight of each stock changes too.

A future change would be to reinvest monthly, so that dividends are taken into consideration when they are paid, and not the next time I buy or sell a stock. As a result, the algo is actually underperforming where it should be.

In [32]:
"""
This is where I initialize my algorithm

"""

from zipline.api import order
from zipline.finance.slippage import FixedSlippage

def initialize(context):
    #load the CEO data and a variable to count the number of stocks held at any time as global variables

    context.CEOs = CEOs
    context.current_stocks = []
    context.stocks_to_order_today = []
    context.stocks_to_sell_today = []


In [33]:
"""
Handle data is the function that runs every minute (or day) looking to make trades
"""
from zipline.api import order

def handle_data(context, data):
    #: Set my buy and sell lists to empty at the start of any day.
    context.stocks_to_order_today = []
    context.stocks_to_sell_today = []

    # Get today's date.
    today = get_datetime()

    # Get lists of just the companies where start_date (or end_date) is today.
    context.stocks_to_order_today = context.CEOs.SID[context.CEOs.start_date==today].tolist()
    context.stocks_to_sell_today = context.CEOs.SID[context.CEOs.end_date==today].tolist()
    context.stocks_to_sell_today = [s for s in context.stocks_to_sell_today if s != None]
    context.stocks_to_order_today = [s for s in context.stocks_to_order_today if s != None]

    # If there are stocks that need to be bought or sold today
    if len(context.stocks_to_order_today) > 0 or len(context.stocks_to_sell_today) > 0:

        # First sell any that need to be sold, and remove them from current_stocks.
        for stock in context.stocks_to_sell_today:
            if stock in data:
                if stock in context.current_stocks:
                    order_target(stock, 0)
                    context.current_stocks.remove(stock)
                    #print "Selling %s" % stock

        # Then add any I am buying to current_stocks.
        for stock in context.stocks_to_order_today:
            context.current_stocks.append(stock)

        # Then rebalance the portfolio so I have an equal amount of each stock in current_stocks.
        for stock in context.current_stocks:
            if stock in data:
                portfolio_value = context.portfolio.portfolio_value
                num_stocks = len(context.current_stocks)

                # The order step is missing from this export. It is reconstructed here
                # (as an assumption) from the equal-weight description and the
                # sector-neutral version of this function.
                value_to_buy = portfolio_value / num_stocks
                order_target_value(stock, value_to_buy)
                #print "Buying and/or rebalancing %s at value = %s" % (stock, value_to_buy)



In [34]:
"""
This cell gets the historical pricing data for all the SIDs in my universe,
then kicks off my algo using that data.
"""
# Assumed import: the original cell does not show where TradingAlgorithm comes from.
from zipline.algorithm import TradingAlgorithm

# I set the start and end date I want my algo to run for
start_algo = '2002-01-01'
end_algo = '2014-12-31'

# I make a series out of just the SIDs.
SIDs = CEOs.SID

# Then call get_pricing on the series of SIDs and store the results in a new dataframe called data.
data = get_pricing(
    SIDs,
    start_date=start_algo,
    end_date=end_algo,
    fields='close_price',
    handle_missing='ignore'
)

#: Here I'm defining the algo that I have above so I can run with a new graphing method
my_algo = TradingAlgorithm(
    initialize=initialize,
    handle_data=handle_data
)

#: Create a figure to plot on the same graph
fig = pyplot.figure()
ax1 = fig.add_subplot(211)

#: Create our plotting algorithm
def my_algo_analyze(context, perf):
    perf.portfolio_value.plot(ax=ax1, label="Fortune 1000 Women-Led Companies")

#: Insert our analyze methods
my_algo._analyze = my_algo_analyze

# Run algorithms
returns = my_algo.run(data)

#: Plot the graph
ax1.set_ylabel('portfolio value in $', fontsize=20)
ax1.set_title("Cumulative Return", fontsize=20)
ax1.legend(loc='best')
fig.tight_layout()
pyplot.show()

## Benchmarks

To get a benchmark, I'm using a function, get_backtest, which pulls in all of the results of a backtest from the Quantopian IDE. In this case, my algorithm does nothing other than set a benchmark. This gives me a benchmark for which all the work has already been done.

In [35]:
benchmark_bt = get_backtest('54ef94a65457f30f0b4db137')

I plot the cumulative returns of this benchmark against those of my algo to see their relative performance.

In [36]:
#: Create a figure to plot on the same graph
fig = pyplot.figure(figsize=(20,22))
ax1 = fig.add_subplot(211)

#: Plot the graph
cum_returns = pd.Series(
    my_algo.perf_tracker.cumulative_risk_metrics.algorithm_cumulative_returns[:len(benchmark_bt.risk.index)],
    index=benchmark_bt.risk.index
)
benchmark_bt.risk.benchmark_period_return.plot(ax=ax1)
cum_returns.plot(ax=ax1)

ax1.set_ylabel('% Cumulative Return', fontsize=20)
ax1.set_title("Cumulative Return", fontsize=20)
ax1.legend(["SPY", "Fortune 1000 Women-Led Companies"], loc='best')
fig.tight_layout()
pyplot.show()

## Returns

I calculate the returns for the benchmark and my algo, and then the difference between them.
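
The arithmetic behind this comparison is simple enough to sketch on its own. The example below is illustrative only: the function names and the portfolio values are made up and are not the backtest's numbers. A cumulative return of 1.5 means the portfolio gained 150% over the period.

```python
def cumulative_return(start_value, end_value):
    """Total return over the whole period, as a fraction (1.5 == 150%)."""
    return end_value / float(start_value) - 1.0


def excess_return_pct(algo_cum, bench_cum):
    """Difference between two cumulative returns, in percentage points."""
    return (algo_cum - bench_cum) * 100.0


# Made-up values: both portfolios start at 100,000.
algo = cumulative_return(100000.0, 250000.0)
bench = cumulative_return(100000.0, 200000.0)

print(algo * 100.0)                    # 150.0
print(bench * 100.0)                   # 100.0
print(excess_return_pct(algo, bench))  # 50.0
```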

In [37]:
benchmark_bt.risk.benchmark_period_return.iloc[-1]

Out[37]:
1.22305952214

In [38]:
bench_tot_return = benchmark_bt.risk.benchmark_period_return.iloc[-1]
algo_tot_return = my_algo.perf_tracker.cumulative_risk_metrics.algorithm_cumulative_returns[-1]

bench_pct_ret = bench_tot_return * 100
algo_pct_ret = algo_tot_return * 100
bench_algo_diff = (algo_tot_return - bench_tot_return) * 100

print "Algo Percent Returns %s" % algo_pct_ret
print "Benchmark Percent Returns %s" % bench_pct_ret
print "Difference %s" % bench_algo_diff

Algo Percent Returns 338.8528772
Benchmark Percent Returns 122.305952214
Difference 216.546924986

## Leverage

In [39]:
# I verify that my leverage is still in line.
returns.gross_leverage.plot()

Out[39]:
<matplotlib.axes._subplots.AxesSubplot at 0x7ffb7311be10>

## Sharpe

In [40]:
# I also take a look at the Sharpe Ratio. Sharpe helps understand the return of a strategy relative to its volatility.
# Higher is better, and because my strategy looks more volatile than the S&P, it's worth considering.

# This is the sharpe of my algo
pct_change = returns['portfolio_value'].pct_change()
sharpe = (pct_change.mean()*252)/(pct_change.std() * np.sqrt(252))
sharpe

Out[40]:
0.63048095824464745

In [41]:
# This is the sharpe of the benchmark
# Caveat: pct_change() is applied here to a cumulative-return series rather than to a
# price or portfolio-value series, so this figure is not directly comparable to the algo's Sharpe above.
bench_pct_change = benchmark_bt.risk.benchmark_period_return.pct_change()
bench_sharpe = (bench_pct_change.mean()*252)/(bench_pct_change.std() * np.sqrt(252))
bench_sharpe

Out[41]:
0.22512942045928613

This Sharpe ratio isn't exceptional or anything, but it's good enough to be considered for the Quantopian Managers program (https://www.quantopian.com/managers), and it also beats out the S&P, so I am happy.

## Yahoo & Alibaba

A couple of people have asked, "What if you remove Yahoo and Alibaba? Is this all due to the incredible performance there?" It's pretty easy to test that out.

In [42]:
security = 14848 #found this by hand 3647, 660, 14848, 8354,
adm_df = CEOs[(CEOs['SID'] == security)]

In [43]:
fig = pyplot.figure()
ax2 = fig.add_subplot(212)

start_date = adm_df['start_date']
end_date = adm_df['end_date']

data[security].plot(ax=ax2, figsize=(15, 18), color='g')
ax2.plot(start_date, data.ix[start_date][security], '^', markersize=20, color='b', linestyle='')
ax2.plot(end_date, data.ix[end_date][security], 'v', markersize=20, color='b', linestyle='')

pyplot.ylabel('% Cumulative Return', fontsize=20)
pyplot.title("Cumulative Return", fontsize=20)
pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)
pyplot.legend(['Yahoo'], frameon=False, loc='best')

print adm_df['CEO']

72 Carol Bartz
73 Marissa Mayer
Name: CEO, dtype: object

In [44]:
#Remove Yahoo
CEOs_yhoo = CEOs[(CEOs['Ticker'] != ('YHOO'))]

In [45]:
"""
This cell gets the historical pricing data for all the SIDs except Yahoo,
then kicks off my algo using that data.
"""
# I set the start and end date I want my algo to run for
start_algo = '2002-01-01'
end_algo = '2014-12-31'

# I make a series out of just the SIDs.
SIDs = CEOs_yhoo.SID

# Then call get_pricing on the series of SIDs and store the results in a new dataframe called data.
data = get_pricing(
    SIDs,
    start_date=start_algo,
    end_date=end_algo,
    fields='close_price',
    handle_missing='ignore'
)

#: Here I'm defining the algo that I have above so I can run with a new graphing method
my_algo_yhoo = TradingAlgorithm(
    initialize=initialize,
    handle_data=handle_data
)

#: Insert our analyze methods
my_algo_yhoo._analyze = my_algo_analyze

# Run algorithms
returns_yhoo = my_algo_yhoo.run(data)

In [46]:
pyplot.figure(figsize=[16,10])

benchmark_bt.risk.benchmark_period_return.plot()
returns_yhoo.algorithm_period_return.plot()

pyplot.ylabel('% Cumulative Return', fontsize=20)
pyplot.title("Cumulative Return", fontsize=20)
pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)
pyplot.legend(['SPY', 'Fortune 1000 Female CEOs'], frameon=False, loc='best')

Out[46]:
<matplotlib.legend.Legend at 0x7ffb78e35490>

In [47]:
bench_tot_return = benchmark_bt.risk.benchmark_period_return[-1]
algo_tot_return = my_algo_yhoo.perf_tracker.cumulative_risk_metrics.algorithm_cumulative_returns[-1]

bench_pct_ret = bench_tot_return * 100
algo_pct_ret = algo_tot_return * 100
bench_algo_diff = (algo_tot_return - bench_tot_return) * 100

print "Algo Percent Returns %s" % algo_pct_ret
print "Benchmark Percent Returns %s" % bench_pct_ret
print "Difference %s" % bench_algo_diff

Algo Percent Returns 318.9465792
Benchmark Percent Returns 122.305952214
Difference 196.640626986

## Remove the outliers

Someone else asked me to remove the top and bottom outliers. Here I remove the top 3 and the bottom 3.

In [48]:
# Remove the top 3
CEOs_outliers = CEOs[(CEOs['Ticker'] != ('HSNI'))]
CEOs_outliers = CEOs_outliers[(CEOs_outliers['Ticker'] != ('VTR'))]
CEOs_outliers = CEOs_outliers[(CEOs_outliers['Ticker'] != ('TJX'))]

# Remove the bottom 3
CEOs_outliers = CEOs_outliers[(CEOs_outliers['Ticker'] != ('NYT'))]
CEOs_outliers = CEOs_outliers[(CEOs_outliers['Ticker'] != ('RAD'))]
CEOs_outliers = CEOs_outliers[(CEOs_outliers['Ticker'] != ('Q'))]

In [49]:
"""
This cell gets the historical pricing data for all the SIDs except the outliers,
then kicks off my algo using that data.
"""
# I set the start and end date I want my algo to run for
start_algo = '2002-01-01'
end_algo = '2014-12-31'

# I make a series out of just the SIDs.
SIDs = CEOs_outliers.SID

# Then call get_pricing on the series of SIDs and store the results in a new dataframe called data.
data = get_pricing(
    SIDs,
    start_date=start_algo,
    end_date=end_algo,
    fields='close_price',
    handle_missing='ignore'
)

#: Here I'm defining the algo that I have above so I can run with a new graphing method
my_algo_outliers = TradingAlgorithm(
    initialize=initialize,
    handle_data=handle_data
)

#: Insert our analyze methods
my_algo_outliers._analyze = my_algo_analyze

# Run algorithms
returns_outliers = my_algo_outliers.run(data)

In [50]:
pyplot.figure(figsize=[16,10])

benchmark_bt.risk.benchmark_period_return.plot()
returns_outliers.algorithm_period_return.plot()

pyplot.ylabel('% Cumulative Return', fontsize=20)
pyplot.title("Cumulative Return", fontsize=20)
pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)
pyplot.legend(['SPY', 'Fortune 1000 Female CEOs'], frameon=False, loc='best')

Out[50]:
<matplotlib.legend.Legend at 0x7ffb6ae177d0>

In [51]:
bench_tot_return = benchmark_bt.risk.benchmark_period_return[-1]
algo_tot_return = my_algo_outliers.perf_tracker.cumulative_risk_metrics.algorithm_cumulative_returns[-1]

bench_pct_ret = bench_tot_return * 100
algo_pct_ret = algo_tot_return * 100
bench_algo_diff = (algo_tot_return - bench_tot_return) * 100

print "Algo Percent Returns %s" % algo_pct_ret
print "Benchmark Percent Returns %s" % bench_pct_ret
print "Difference %s" % bench_algo_diff

Algo Percent Returns 266.5993303
Benchmark Percent Returns 122.305952214
Difference 144.293378086

## Sectors

In [52]:
sectors = local_csv('CEOs_sector_output_v2.csv')
sector_count = sectors['sector'].value_counts(sort=False)
sector_count.plot(kind='bar')

pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)

It does look like I have a slight bias towards consumer cyclical companies. These include companies such as GM, eBay, The New York Times, and Ann Taylor Stores.

The next question might be, "Is my sector weighting responsible for the performance?"

Using XLY, a consumer discretionary ETF, we can get a comparison of how consumer companies are doing against the S&P 500 for the same time period.

In [53]:
consumer = get_pricing(['XLY','SPY'], start_date = '2002-01-02', end_date = '2015-02-01', fields = 'close_price')

def cum_returns(df):
    return (1 + df).cumprod() - 1

cum_returns(consumer.pct_change()).plot()

Out[53]:
<matplotlib.axes._subplots.AxesSubplot at 0x7ffb6ad7dcd0>

# Sector Neutral

This sector-neutral version of the algo attempts to remove the bias towards consumer companies that the original algo has. It does this by first determining the number of sectors that the portfolio holds each time it is rebalanced, and then dividing the portfolio value by the number of sectors. It then determines the number of companies per sector, and divides the portfolio value per sector by the number of companies in the given sector. This ensures that all sectors are invested in equally.
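
The two-step division described above can be sketched in isolation. This is a minimal illustration using only the standard library. The function name `sector_neutral_values`, the portfolio value, and the sector label for DUK are assumptions made for the example, not data from the notebook.

```python
from collections import Counter


def sector_neutral_values(portfolio_value, sector_by_stock):
    """Split the portfolio equally across sectors, then equally within each sector.

    `sector_by_stock` maps each held stock to its sector label.
    Hypothetical helper for illustration only.
    """
    if not sector_by_stock:
        return {}
    companies_per_sector = Counter(sector_by_stock.values())
    value_per_sector = float(portfolio_value) / len(companies_per_sector)
    return {
        stock: value_per_sector / companies_per_sector[sector]
        for stock, sector in sector_by_stock.items()
    }


# Illustrative holdings: two consumer cyclical stocks and one utility (assumed label).
holdings = {'GM': 'consumer cyclical', 'EBAY': 'consumer cyclical', 'DUK': 'utilities'}
targets = sector_neutral_values(120000.0, holdings)
print(targets['GM'], targets['EBAY'], targets['DUK'])  # 30000.0 30000.0 60000.0
```

Under plain equal weighting each of the three stocks would get 40,000. Here the lone utility gets as much as the two consumer stocks combined, which is what keeps the sectors balanced.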

In [54]:
sectors_data = local_csv('CEOs_sector_output_v2.csv')

def get_sec_SID(row):
    temp_sid = row['SID']
    row['SID'] = symbols(temp_sid)
    return row

sectors_data = sectors_data.apply(get_sec_SID, axis=1)
CEOs = pd.merge(CEOs, sectors_data, how='left')

In [55]:
"""
This is where I initialize my algorithm
"""
from zipline.api import order
from zipline.finance.slippage import FixedSlippage

def initialize(context):
    #load the CEO data and a variable to count the number of stocks held at any time as global variables
    context.CEOs = CEOs
    context.current_stocks = []
    context.stocks_to_order_today = []
    context.stocks_to_sell_today = []
    context.set_slippage(FixedSlippage(spread=0))
    context.num_sectors = 0

In [56]:
"""
Handle data is the function that runs every minute (or day) looking to make trades
"""
from zipline.api import order

def handle_data(context, data):
    #: Set my buy and sell lists to empty at the start of any day.
    context.stocks_to_order_today = []
    context.stocks_to_sell_today = []
    current_CEOs = context.CEOs

    # Get today's date.
    today = get_datetime()

    # Get lists of just the companies where start_date (or end_date) is today.
    context.stocks_to_order_today = context.CEOs.SID[context.CEOs.start_date==today].tolist()
    context.stocks_to_sell_today = context.CEOs.SID[context.CEOs.end_date==today].tolist()
    context.stocks_to_sell_today = [s for s in context.stocks_to_sell_today if s != None]
    context.stocks_to_order_today = [s for s in context.stocks_to_order_today if s != None]

    # If there are stocks that need to be bought or sold today
    if (len(context.stocks_to_order_today) > 0) or (len(context.stocks_to_sell_today) > 0):

        # First sell any that need to be sold, and remove them from current_stocks.
        for stock in context.stocks_to_sell_today:
            if stock in data:
                if stock in context.current_stocks:
                    order_target(stock, 0)
                    context.current_stocks.remove(stock)
                    #print "Selling %s" % stock

        # Then add any I am buying to current_stocks.
        for stock in context.stocks_to_order_today:
            context.current_stocks.append(stock)

        #get the list of current CEOs so that we can find the sector information
        current_CEOs = context.CEOs[context.CEOs.SID.isin(context.current_stocks)]

        #count the number of sectors
        context.num_sectors = current_CEOs.sector_id.nunique()

        #calculate the value to buy
        #get the current portfolio value
        portfolio_value = context.portfolio.portfolio_value

        #get the value to be invested in each sector
        value_per_sector = portfolio_value/context.num_sectors

        #series of sectors and the number of companies in the sector
        sector_count = current_CEOs['SID'].groupby(current_CEOs['sector_id']).count()

        # Then rebalance so each sector gets an equal share, split equally among its stocks.
        for stock in context.current_stocks:
            if stock in data:
                #get the sector of the current company
                current_company_sector = context.CEOs.sector_id[(context.CEOs.SID == stock)].iloc[0]

                #get the number of companies in that sector
                num_companies_in_sector = sector_count.loc[current_company_sector]

                #calculate the amount to invest in this company
                value_to_buy = value_per_sector/num_companies_in_sector

                #place an order for that amount of this stock
                order_target_value(stock, value_to_buy)

In [57]:
"""
This cell gets the historical pricing data for all the SIDs in my universe.
Then kicks off my algo using that data.
"""
# I set the start and end date I want my algo to run for
start_algo = '2002-01-01'
end_algo = '2014-12-30'

# I make a series out of just the SIDs.
SIDs = CEOs.SID

# Then call get_pricing on the series of SIDs and store the results in a new dataframe called data.
data = get_pricing(
    SIDs,
    start_date=start_algo,
    end_date=end_algo,
    fields='close_price',
    handle_missing='ignore'
)

#: Here I'm defining the algo that I have above so I can run with a new graphing method
my_algo_sectors = TradingAlgorithm(
    initialize=initialize,
    handle_data=handle_data
)

# Run algorithms
returns_sectors = my_algo_sectors.run(data)

In [59]:
pyplot.figure(figsize=[16,10])

benchmark_bt.risk.benchmark_period_return.plot()
returns_sectors.algorithm_period_return.plot()

pyplot.ylabel('% Cumulative Return', fontsize=20)
pyplot.title("Cumulative Return", fontsize=20)
pyplot.grid(b=None, which='major', axis='both')
pyplot.box(on=None)
pyplot.legend(['SPY', 'Fortune 1000 Female CEOs'], frameon=False, loc='best')

Out[59]:
<matplotlib.legend.Legend at 0x7ffb68806bd0>

In [60]:
bench_tot_return = benchmark_bt.risk.benchmark_period_return.iloc[-1]
algo_tot_return = my_algo_sectors.perf_tracker.cumulative_risk_metrics.algorithm_cumulative_returns[-1]

bench_pct_ret = bench_tot_return * 100
algo_pct_ret = algo_tot_return * 100
bench_algo_diff = (algo_tot_return - bench_tot_return) * 100

print "Algo Percent Returns %s" % algo_pct_ret
print "Benchmark Percent Returns %s" % bench_pct_ret
print "Difference %s" % bench_algo_diff

Algo Percent Returns 275.3446756
Benchmark Percent Returns 122.305952214
Difference 153.038723386

## WIL and PAX

There are at least two existing funds with a gender focus, and I've been told there are as many as 17 gender-focused investment products.

The Pax Global Women’s Leadership Index (PXWIX) is the first broad-market index of the highest-rated companies in the world in advancing women’s leadership.

The Women In Leadership index (WIL) tracks a weighted index of 85 U.S.-based companies that are listed on the NYSE or NASDAQ, have market capitalizations of at least $250 million, and have a woman CEO or a board of directors that’s at least 25% female.

Here is a look at them, plotted against the SPY. We can use these as a decent reference.

In [61]:
funds = local_csv("Womens_Funds.csv", date_column='Date')
funds = funds.sort_index(ascending=True)
funds['SPY'] = get_pricing('SPY', start_date='2002-01-02', end_date='2015-02-19', fields='close_price')

def cum_returns(df):
    return (1 + df).cumprod() - 1

cum_returns(funds.pct_change()).plot()

Out[61]:
<matplotlib.axes._subplots.AxesSubplot at 0x7ffb68782a50>