Stocks, Bonds, and Beyond: Unveiling the Best Portfolio Mix (backed with data)

Asset allocation is a foundational investment strategy that balances risk and reward by distributing a portfolio across diverse asset classes, such as equities, fixed income, and cash equivalents. The approach aims to optimize returns while minimizing risks, considering factors like an investor’s financial goals, risk tolerance, and investment horizon. By diversifying assets with varying performance patterns, asset allocation reduces volatility and enhances long-term stability. Introduced by economist Harry Markowitz in 1952, the theory emphasizes that combining uncorrelated assets can lower overall portfolio risk. This disciplined strategy helps investors focus on long-term objectives rather than reacting to short-term market fluctuations.

The main 3 key points of this strategy are:

  • Diversification: Asset allocation involves spreading investments across various asset classes (e.g., equities, bonds, cash) to manage risk effectively and reduce the impact of poor performance in any single category
  • Risk and Return Balance: It aligns investment choices with an individual’s risk tolerance, financial goals, and time horizon, ensuring an appropriate balance between potential returns and acceptable risk levels
  • Long-term Focus: A well-structured asset allocation strategy promotes consistent returns over time and helps investors stay committed to long-term objectives, avoiding impulsive reactions to short-term market fluctuations

The 3 different approaches

Asset allocation strategies are tailored to align with different investment objectives, typically categorized into growth, balanced, and income approaches.

growth-oriented strategy focuses on maximizing long-term capital appreciation by allocating a significant portion of the portfolio to higher-risk, high-return assets such as equities and real estate. These investments are suitable for investors with a longer time horizon and higher risk tolerance, as they aim to generate substantial returns over time but may experience greater volatility during market fluctuations

In contrast, a balanced strategy seeks to achieve a mix of growth and income by diversifying across asset classes. Portfolios in this category often include a combination of equities (for growth) and fixed-income securities (for stability and income), with smaller allocations to cash or alternative investments for added diversification. This approach is designed for investors who want moderate risk exposure while benefiting from both capital appreciation and steady income streams

Lastly, an income-focused strategy prioritizes generating regular income through investments in low-risk, income-producing assets like bonds, dividend-paying stocks, or real estate investment trusts (REITs). This approach is ideal for conservative investors or those nearing retirement who prefer stability and predictable cash flows over high growth potential

Each strategy reflects varying risk tolerance levels, time horizons, and financial goals, allowing investors to align their portfolios with their specific needs.

Goal of this article

I have tried to provide a very high-level approach to this strategy. If you want to dive deeper into the theory, there are plenty of resources over the internet. In this article, I will go more technical, trying to understand those weights between the various asset classes through some Python code and present my findings.

Let’s get into the data

In the analysis, I will use representative instruments to evaluate the performance and allocation of five key asset classes — Stocks, Bonds, Commodities, Real Estate, and Money Market Funds (MMF). These instruments provide a practical way to understand the behavior of each asset class and their role in different portfolio strategies.

  • ^GSPC (S&P 500 Index): This represents the U.S. stock market, tracking the performance of 500 large-cap companies, making it a benchmark for equity investments.
  • BND (Vanguard Total Bond Market ETF): A comprehensive bond fund that tracks the performance of the U.S. investment-grade bond market, including government, corporate, and mortgage-backed securities.
  • GD=F (Gold Futures): A proxy for commodities, specifically gold, which is often used as a hedge against inflation and market volatility.
  • VNQ (Vanguard Real Estate ETF): Tracks the performance of publicly traded real estate investment trusts (REITs), providing exposure to the real estate sector.
  • JPMorgan Prime Money Market Fund (VMVXX): Represents short-term, low-risk investments that provide liquidity and stability, often used as a cash-equivalent asset class.

These instruments will allow for a detailed analysis of how each asset class performs individually and in combination with different asset allocation strategies.

import pandas as pd
import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
from tqdm import tqdm

prices = ['^GSPC', 'BND', 'GD=F', 'VNQ', 'MMF']
asset_names = ['STOCKS', 'BONDS', 'COMMODITIES', 'REAL-ESTATE' , 'MMF']

df = pd.DataFrame()
for i, ticker in enumerate(prices):
    if ticker != 'MMF':
        data = yf.download(ticker, start='2000-01-01')
        df[asset_names[i]] = data['Adj Close']

df = df.dropna()
df.dropna(inplace=True)
print(f'We have data from {df.index.min()}')




Above, I am getting the prices of the 4 instruments from Yahoo Finance. This way, I managed to get data from early 2007 (almost 16 years). Unfortunately, I did not manage to get any automated feed for the money market fund, so I will add a yearly average return in the code from an online research.

# Create a date range from 2007 to today
date_range = pd.date_range(start='2007-01-01', end=pd.Timestamp.today(), freq='D')
date_df = pd.DataFrame(date_range, columns=['Date'])

# Rates DataFrame with Year and Annual Return (%)
rates_data = {
    "Year": [2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016,
             2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024],
    "Annual Return (%)": [4.5, 2.0, 0.5, 0.1, 0.1, 0.1, 0.1, 0.1,
                          0.2, 0.3, 0.8, 1.8, 2.0, 0.5, 0.0, 1.6, 5.3, 5.0]
}
rates_df = pd.DataFrame(rates_data)
# Calculate daily compounded rate for each year
rates_df['Daily Rate'] = (1 + rates_df['Annual Return (%)'] / 100) ** (1/365) - 1
# Map daily rates to dates
date_df['Year'] = date_df['Date'].dt.year
date_df = date_df.merge(rates_df[['Year', 'Daily Rate']], on='Year', how='left')

# Calculate the equity curve starting from an initial value of $100
equity_curve = [100]  # Start with an initial value of $100
for i in range(1, len(date_df)):
    previous_equity = equity_curve[-1]
    daily_rate = date_df.iloc[i]['Daily Rate']
    equity_curve.append(previous_equity * (1 + daily_rate))

date_df['Equity Curve'] = equity_curve
df['MMF'] = date_df.set_index('Date')['Equity Curve']

# calculate #pct_cjhanges
for asset in asset_names:
    df[f'{asset}_pctChange'] = df[asset].pct_change(fill_method=None).fillna(0)
df




Now we have a dataframe with all the assets and their daily percentage change.

To calculate the outcome of each allocation, I will need to define 3 metrics: maximum drawdown, shape ratio, and volatility. Let’s create the functions that we will use

def calculate_max_drawdown(equity_curve):
    if not isinstance(equity_curve, pd.Series):
        raise ValueError("Equity curve should be a pandas Series")

    cumulative_max = equity_curve.cummax()
    drawdown = (equity_curve - cumulative_max) / cumulative_max
    max_drawdown = drawdown.min() * 100  # Convert to percentage
    return max_drawdown

def calculate_sharpe_ratio(equity_curve, risk_free_rate=0.01):
    if not isinstance(equity_curve, pd.Series):
        raise ValueError("Equity curve should be a pandas Series")

    # Calculate daily returns
    daily_returns = equity_curve.pct_change().dropna()

    # Calculate excess returns over risk-free rate
    excess_returns = daily_returns - (risk_free_rate / 252)

    # Calculate annualized Sharpe Ratio
    sharpe_ratio = np.mean(excess_returns) / np.std(excess_returns) * np.sqrt(252)

    return sharpe_ratio

def calculate_volatility(equity_curve):
    if not isinstance(equity_curve, pd.Series):
        raise ValueError("Input should be a pandas Series")

    returns = equity_curve.pct_change().dropna()
    volatility = returns.std() * np.sqrt(252)  # Annualize the volatility
    return volatility




The next step is to calculate the metrics for each combination of asset allocation. You will notice below a different approach to the money market fund. This is because of a problem that I encountered while doing the analysis. The MMF is practically cash. There is zero (or close to it…) risk, volatility, or drawdown. While doing the analysis, this gave so much preference to MMF that it practically proposed only CASH! One method to address the issue is to calculate all the metrics (except for the return), excluding the cash allocation. I considered this the best approach — but I stand to be corrected. There would also be other logical approaches, like penalizing the metrics of cash, and others.

# Generate combinations with the sum of 100
combinations = [(a, b, c, d, e) for a in range(0, 101, 5)
                for b in range(0, 101, 5)
                for c in range(0, 101, 5)
                for d in range(0, 101, 5)
                for e in range(0, 101, 5) if a + b + c + d + e == 100 and e < 100]

print(f'There are {len(combinations)} combinations')

results = []
# to keed the dfs with equity curves calculated for each combination
results_dfs = {}

for combination in tqdm(combinations):

    df_copy = df.copy()
    df_copy['combined_pctChange'] = 0
    df_copy['combined_pctChange_exc_cash'] = 0

    for i, asset in enumerate(asset_names):
        df_copy[f'{asset}_weight'] = combination[i]/100
        df_copy[f'{asset}_weighted_pctChange'] = df_copy[f'{asset}_weight'] * df_copy[f'{asset}_pctChange']
        df_copy['combined_pctChange'] = df_copy['combined_pctChange'] + df_copy[f'{asset}_weighted_pctChange']
    df_copy['equity_curve'] = (1 + df_copy['combined_pctChange']).cumprod() * 100

    # will also create an equity curve without the cash
    base_total_weight = sum(combination) - combination[4]
    for i, asset in enumerate(asset_names):
        if asset != 'MMF':
            df_copy[f'{asset}_weight_exc_cash'] = combination[i]/base_total_weight
            df_copy[f'{asset}_weighted_pctChange_exc_cash'] = df_copy[f'{asset}_weight_exc_cash'] * df_copy[f'{asset}_pctChange']
            df_copy['combined_pctChange_exc_cash'] = df_copy['combined_pctChange_exc_cash'] + df_copy[f'{asset}_weighted_pctChange_exc_cash']
    df_copy['equity_curve_exc_cash'] = (1 + df_copy['combined_pctChange_exc_cash']).cumprod() * 100


    final_equity_return = (df_copy['equity_curve'].iloc[-1] - 100) / 100 * 100  # Convert final equity return to percentage
    max_drawdown = calculate_max_drawdown(df_copy['equity_curve_exc_cash'])
    sharpe_ratio = calculate_sharpe_ratio(df_copy['equity_curve_exc_cash'])
    volatility = calculate_volatility(df_copy['equity_curve_exc_cash'])

    combination_tag = '_'.join(map(str,combination))

    results_dfs[combination_tag] = df_copy

    results.append((*combination, final_equity_return, max_drawdown, sharpe_ratio, volatility))

results = pd.DataFrame(results, columns=asset_names + ['final_equity_return', 'max_drawdown', 'sharpe_ratio', 'volatility'])
results




With the above part of the code, we have a dataframe named results with metrics for all the possible combinations between the weights and their final results.

To apply our weights, we should normalize our ratios and metrics:

# List of columns to normalize
columns_to_normalize = ['final_equity_return', 'max_drawdown', 'sharpe_ratio', 'volatility']

def z_score_normalize(data):
    return (data - data.mean()) / data.std()

# Normalize only the selected columns and create new columns with _norm suffix
for column in columns_to_normalize:
    results[column + '_norm'] = z_score_normalize(results[column])




At the beginning of this article, I talked about the 3 approaches of growth, balance, and income. For that reason, I will create a dictionary with the weights for each approach and each metric, so we see the results for each and every one.

weights_growth = {
    'final_equity_return': 0.55,
    'max_drawdown': 0.15,
    'sharpe_ratio': 0.2,
    'volatility': 0.1
}

weights_balanced = {
    'final_equity_return': 0.35,
    'max_drawdown': 0.2,
    'sharpe_ratio': 0.35,
    'volatility': 0.1
}

weights_income = {
    'final_equity_return': 0.1,
    'max_drawdown': 0.4,
    'sharpe_ratio': 0.2,
    'volatility': 0.25
}




You will notice that the return is more aggressive at the growth approach, while the income part is more on the maximum drawdown. Again, if you consider different weights appropriate, you can follow along with the code and update the above with your preferred ones.

Now is the time to apply those weights for each combination. The code below will create a final metric for each approach in our results dataframe

results['total_metric_growth'] = (results['final_equity_return_norm']*weights_growth['final_equity_return']) + (results['max_drawdown_norm']*weights_growth['max_drawdown']) + (results['sharpe_ratio_norm']*weights_growth['sharpe_ratio']) + (results['volatility_norm']*weights_growth['volatility'])

results['total_metric_balanced'] = (results['final_equity_return_norm']*weights_balanced['final_equity_return']) + (results['max_drawdown_norm']*weights_balanced['max_drawdown']) + (results['sharpe_ratio_norm']*weights_balanced['sharpe_ratio']) + (results['volatility_norm']*weights_balanced['volatility'])

results['total_metric_income'] = (results['final_equity_return_norm']*weights_income['final_equity_return']) + (results['max_drawdown_norm']*weights_income['max_drawdown']) + (results['sharpe_ratio_norm']*weights_income['sharpe_ratio']) + (results['volatility_norm']*weights_income['volatility'])

results




Let’s find out which are the best combinations for each approach:

results_copy = results.copy()
# Select the row with the maximum total_metric_growth
max_growth_row = results.loc[results_copy['total_metric_growth'].idxmax()]
max_balanced_row = results.loc[results_copy['total_metric_balanced'].idxmax()]
max_income_row = results.loc[results_copy['total_metric_income'].idxmax()]
best = pd.DataFrame([max_growth_row, max_balanced_row, max_income_row])

# Display the selected rows
best




That is quite interesting! For growth and balance, the best approach for the last 17 years is to fully invest in stocks, while for income, 30% of stocks and 70% of bonds are the preferred ones. That apparently is quite extreme, so (for now) let’s keep stocks and bonds and see how different weights would affect the equity curve.

plt.figure(figsize=(12, 6))

plt.plot(results_dfs['100_0_0_0_0'].index, results_dfs['100_0_0_0_0']['equity_curve'], label='100/0')
plt.plot(results_dfs['70_30_0_0_0'].index, results_dfs['70_30_0_0_0']['equity_curve'], label='70/30')
plt.plot(results_dfs['60_40_0_0_0'].index, results_dfs['60_40_0_0_0']['equity_curve'], label='60/40')
plt.plot(results_dfs['50_50_0_0_0'].index, results_dfs['50_50_0_0_0']['equity_curve'], label='50/50')
plt.plot(results_dfs['30_70_0_0_0'].index, results_dfs['30_70_0_0_0']['equity_curve'], label='30/70')
plt.plot(results_dfs['0_100_0_0_0'].index, results_dfs['0_100_0_0_0']['equity_curve'], label='0/100')

plt.title('Equity Curves Comparison')
plt.xlabel('Date')
plt.ylabel('Equity Value')
plt.legend()
plt.grid(True)
plt.show()




Now, that is interesting (and expected, if I may say). You will see that if you had invested 100% in bonds during the period till 2016, your return would have been better than stocks. After that year, the stocks take the lead, which brings us to today’s stock market “inflated” concerns…

But as we said, return should not be our only concern. Let’s take a look at the rest of the ratios. I will, of course include only the combinations that are with zero cash (MMF) and use box plots so we will be able to see the whole range of values

data = results[results['MMF'] == 0]

# Prepare grouped data for each asset class
stocks_grouped = [group['max_drawdown'] for _, group in data.groupby('STOCKS')]
bonds_grouped = [group['max_drawdown'] for _, group in data.groupby('BONDS')]
commodities_grouped = [group['max_drawdown'] for _, group in data.groupby('COMMODITIES')]
real_estate_grouped = [group['max_drawdown'] for _, group in data.groupby('REAL-ESTATE')]

# Get unique allocation percentages for x-axis labels
stocks_labels = sorted(data['STOCKS'].unique())
bonds_labels = sorted(data['BONDS'].unique())
commodities_labels = sorted(data['COMMODITIES'].unique())
real_estate_labels = sorted(data['REAL-ESTATE'].unique())

# Create subplots in a single row
fig, axes = plt.subplots(1, 4, figsize=(20, 5), sharey=True)

# Plot box plots for each asset class
axes[0].boxplot(stocks_grouped, labels=stocks_labels, showfliers=True)
axes[0].set_title('STOCKS')
axes[0].set_xlabel('Allocation (%)')
axes[0].set_ylabel('Maximum Drawdown (%)')

axes[1].boxplot(bonds_grouped, labels=bonds_labels, showfliers=True)
axes[1].set_title('BONDS')
axes[1].set_xlabel('Allocation (%)')

axes[2].boxplot(commodities_grouped, labels=commodities_labels, showfliers=True)
axes[2].set_title('COMMODITIES')
axes[2].set_xlabel('Allocation (%)')

axes[3].boxplot(real_estate_grouped, labels=real_estate_labels, showfliers=True)
axes[3].set_title('REAL-ESTATE')
axes[3].set_xlabel('Allocation (%)')

# Adjust layout and show the plot
plt.tight_layout()
plt.show()




As we can see, the maximum drawdown of bonds is more consistent. This proves that stocks, commodities, and real estate will suffer a significant drawdown at some point. The rest, stocks, commodities, and bonds, have a wide range, indicating that it depends on the allocation of bonds in the portfolio.

With a simple alteration to the code, we can get the same box plots for Sharpe Ratio.

The Sharpe ratio of stocks looks quite clear! The more stocks the better. However, we can see a lot of outliers around the 15%–35% allocation of bonds, indicating that the inclusion of bonds in the portfolio provides a much better Sharpe ratio, better than fully holding bonds!

But what about volatility?

As expected, the bonds have less volatility, and the more we hold, the better we sleep at night! What is also interesting is that the stocks’ average volatility is quite flat, indicating that when it comes to volatility, a smart way of including stocks in the portfolio will not affect much the total outcome.

One last thing before we wrap up. Besides the return, we have quite left the cash out of the game… So, let’s try and see how the cash affects the portfolio for each approach (growth, balances, and income).

asset_to_plot = 'MMF'


filtered_results = results.copy()

columns_to_scale = ['total_metric_growth', 'total_metric_balanced', 'total_metric_income']

def scale_sizes(sizes, min_size=20, max_size=1000):
    return ((sizes - np.min(sizes)) / (np.max(sizes) - np.min(sizes))) * (max_size - min_size) + min_size

# Use .loc to ensure proper indexing and avoid SettingWithCopyWarning
for column in columns_to_scale:
    filtered_results[column + '_scaled'] = scale_sizes(filtered_results[column])

# Metrics to plot
metrics = ['total_metric_growth_scaled', 'total_metric_balanced_scaled', 'total_metric_income_scaled']
filtered_data = filtered_results[[asset_to_plot] + metrics]

# Plotting
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(18, 6))



# Create a box plot for each metric with STOCKS as the x-axis
for ax, metric, title in zip(axes,
                              ['total_metric_growth_scaled', 'total_metric_balanced_scaled', 'total_metric_income_scaled'],
                              ['Growth', 'Balanced', 'Income']):
    filtered_data.boxplot(column=metric, by=asset_to_plot, ax=ax)
    ax.set_title(title)
    ax.set_xlabel(asset_to_plot)
    ax.set_ylabel('Metric Value')

plt.suptitle('')  # Remove automatic subtitle from Pandas
plt.tight_layout()
plt.show()




What is interesting is that for growth, balance, and income, the more cash, the worse our metrics are becoming. However, what is obvious (and expected) from those box plots is that growth is more affected by cash (the more cash you keep, the less growth you can expect), while balance and income are affected more or less the same.

Conclusions and What is the problem with this article?

The aim of this article was to provide some visual answers that some of you might already know. So, first things first, let’s see what is answered:

  • You always hear that in times of stock market issues, investors turn to bonds. I think that this is more than answered, and proven.
  • Cash, in any case, will hit your metrics hard. This article proves that every investor should keep a part of their cash wisely. Don’t trade money you cannot afford to lose!
  • If you are concerned about draw-downs and risk— bonds are your friends.
  • Commodities and real estate don’t give us any significant edge — while they react in a nonlinear way. Stocks and bonds are pretty clear. the more the best — the less, the worse, or the opposite, depending on the metric. What I would say is that those commodities and real estate are for advanced investors, and if you want a simple and low-maintenance strategy — stick to stocks and bonds.

But what is the problem with this article? The problem is that I have not talked about re-balancing. Well, I did not promise that 😉 Rebalancing asset allocation involves periodically adjusting the allocation of assets in your portfolio based on your analysis of the market conditions.

This is another huge concept, and soon, I will be publishing my analysis on that. So follow me! (when the next article is published, I will update it here, so if you are reading this, it means that it is not yet published…)

Feel free to use the code and play around with the weights to see how things change. The sky is the limit. For non-python proficient readers, I have updated a CSV file in GitHub to play around using Excel with all the results of the above code. The full Python notebook you can find it here.

Thanks for reading!