In the last post, we had imported data from CSV in a dataframe. I must tell you that operation in dataframe takes less time and they are more efficient and easy to operate. Dataframe could be a huge topic for a python programmer, but as a trader cum programmer, this is simple and we are only going to use limited functions.
In the last example we had imported data contained in a CSV file in a dataframe vide code df = pd.read_csv("myDataFile.csv")
Each column of data can be accessed separately as:
df['Open']
df['High']
df['Low']
df['Close']
df['Volume']
The analogy with Excel operations - Arithmetic Operations
If you have excel or any other worksheet experience,
understanding dataframe operation would be easy.
Assuming, you were working
on excel, you could have calculated the range by subtracting low values from the
high value as shown below:
The formula in excel could now have been copied for all
other cells for the calculation. Similarly, in a dataframe, you can do operations
between two columns. However, you can only name the column by not the row
number. For example in excel, we have written c3-d4, instead, you are only
allowed “c-d” in the dataframe and you can refer to a column with the title of the
column.
Lets say you have a data in CSV and you read it and make a
dataframe as I discussed in my earlier post
Datetime |
High |
Low |
Range |
2021-09-01T00:00:00 |
650 |
636 |
14 |
2021-09-02T00:00:00 |
648 |
640.5 |
7.5 |
2021-09-03T00:00:00 |
649.9 |
637.6 |
12.3 |
2021-09-06T00:00:00 |
661 |
638.2 |
22.8 |
2021-09-07T00:00:00 |
648.4 |
631 |
17.4 |
2021-09-08T00:00:00 |
641.15 |
631 |
10.15 |
2021-09-09T00:00:00 |
635 |
626.4 |
8.6 |
You can perform the above operation in python dataframe as :
df[‘Range’] = df[‘High’]-df[‘Low’]
If no column with name Range exists in your dataframe a new
column will be created.
Now, you can use the same method for doing all other Arithmetic Operations like addition, multiplication or divide. You can view your dataframe anytime by using print command.
print(df)
Data in dataframe can be assumed as under :
Index |
0 |
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
Close |
14.5 |
14.7 |
15.0 |
15.5 |
14.3 |
15.1 |
15.6 |
16.2 |
15.4 |