Now that we understand how to read and write data, we can learn how to modify it: moving columns, deleting columns, renaming columns, or referencing specific columns.
import pandas as pd

df = pd.read_csv('sp500_ohlc.csv', index_col='Date', parse_dates=True)
print(df.head())

df2 = df['Open']
print(df2.head())
Here, we've done our typical import of pandas and then read in our CSV file. Then we define a new variable, df2, which is equal to just the Open column of df. Selecting a single column this way gives us a pandas Series rather than a DataFrame, and it still retains the Date index.
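To make the single-column case concrete, here is a minimal sketch of the difference between single and double brackets. It uses a small made-up DataFrame as a stand-in for sp500_ohlc.csv (the prices and dates are hypothetical), so it runs without the file.

```python
import pandas as pd

# Hypothetical stand-in for sp500_ohlc.csv; the values are made up for illustration.
df = pd.DataFrame(
    {
        "Open": [1400.0, 1410.5, 1395.2],
        "High": [1415.0, 1420.0, 1402.1],
        "Low": [1390.0, 1401.3, 1380.7],
        "Close": [1410.5, 1395.2, 1399.9],
    },
    index=pd.to_datetime(["2012-01-03", "2012-01-04", "2012-01-05"]),
)
df.index.name = "Date"

open_series = df["Open"]   # single brackets: a Series
open_frame = df[["Open"]]  # double brackets: a one-column DataFrame

print(type(open_series).__name__)
print(type(open_frame).__name__)

# Both selections keep the original Date index.
print(open_series.index.equals(df.index))
```

Which form you want depends on what comes next: a Series is handy for quick math on one column, while the one-column DataFrame keeps the column name and the DataFrame methods.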
What if we want multiple columns? We pass a list of column names inside the brackets. Here we reference Close and High for our dataset.
df3 = df[['Close', 'High']]
print(df3.head())
How about renaming columns? This is done with the .rename() method, where you pass a dictionary that maps each old column name to its new name.
# Reassign the result; renaming a selection in place can raise SettingWithCopyWarning
df3 = df3.rename(columns={'Close': 'CLOSE!!'})
print(df3.head())
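One detail worth seeing in action: by default, .rename() returns a new DataFrame and leaves the one you called it on untouched. The sketch below shows this with a small made-up DataFrame (hypothetical values, not the real S&P 500 data). As far as I know, using inplace=True on a selection from another DataFrame can trigger a SettingWithCopyWarning in many pandas versions, which is one reason to prefer reassigning the result.

```python
import pandas as pd

# Hypothetical values standing in for the real price data.
df = pd.DataFrame({"Close": [1410.5, 1395.2], "High": [1415.0, 1420.0]})

df3 = df[["Close", "High"]]

# rename() returns a new DataFrame; df3 itself is not modified here.
renamed = df3.rename(columns={"Close": "CLOSE!!"})

print(list(df3.columns))
print(list(renamed.columns))
```

Keys in the dictionary that do not match an existing column are ignored by default, so a typo in the old name silently renames nothing.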
What about referencing specific data only? We can filter rows with a condition. Here we say we just want to see the rows that have a close of over 1400:
df4 = df3[df3['CLOSE!!'] > 1400]
print(df4)
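To see what that filter is doing under the hood, the sketch below splits it into two steps: the comparison builds a boolean Series (a "mask"), and indexing with that mask keeps only the rows marked True. The data is made up for illustration, and the second condition on High is a hypothetical addition to show how conditions combine.

```python
import pandas as pd

# Hypothetical values; the real data would come from sp500_ohlc.csv.
df3 = pd.DataFrame(
    {"CLOSE!!": [1390.0, 1405.5, 1412.3], "High": [1399.0, 1410.0, 1420.0]},
    index=pd.to_datetime(["2012-01-03", "2012-01-04", "2012-01-05"]),
)

# Step 1: the comparison produces one True/False value per row.
mask = df3["CLOSE!!"] > 1400
print(mask.tolist())  # [False, True, True]

# Step 2: indexing with the mask keeps only the True rows.
df4 = df3[mask]
print(df4)

# Combine conditions with & (and) or | (or), wrapping each one in parentheses.
df5 = df3[(df3["CLOSE!!"] > 1400) & (df3["High"] < 1415)]
print(df5)
```

The parentheses matter when combining conditions, because & and | bind more tightly than the comparison operators in Python.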