Today 21.3.7
- Reviewing the parts of Python aggregation that confused me
Efficient summaries
# computing the IQR (interquartile range)
# try 1)
def iqr(column):
    return column.quantile(0.75) - column.quantile(0.25)
# try 2)
print(sales['temperature_c'].agg(iqr))
# 1) and 2) have the same result.
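To convince myself that calling the function directly and passing it to .agg really agree, here is a minimal sketch on made-up data. The `toy_sales` table and its values are hypothetical stand-ins, not the real sales data.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `sales` table
toy_sales = pd.DataFrame({"temperature_c": [10.0, 12.0, 14.0, 16.0, 18.0]})

def iqr(column):
    # interquartile range: 75th percentile minus 25th percentile
    return column.quantile(0.75) - column.quantile(0.25)

# try 1) call the function on the column yourself
direct = iqr(toy_sales["temperature_c"])

# try 2) hand the function to .agg and let pandas call it
via_agg = toy_sales["temperature_c"].agg(iqr)

print(direct, via_agg)
```

The .agg form pays off when summarizing several columns or several functions at once, since it accepts lists of columns and lists of functions.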
Cumulative statistics
sales = sales.sort_values(by = 'date')
sales['cum_weekly_sales'] = sales['weekly_sales'].cumsum()
sales['cum_max_sales'] = sales['weekly_sales'].cummax()
print(sales[['date','weekly_sales','cum_weekly_sales', 'cum_max_sales']])
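A small worked example helps me see why the sort has to come first: the running total and running maximum follow row order. This is only a sketch, and the `toy_sales` rows below are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; the dates are deliberately out of order
toy_sales = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-15", "2021-01-01", "2021-01-08"]),
    "weekly_sales": [30.0, 20.0, 10.0],
})

# Sort chronologically so the cumulative columns run forward in time
toy_sales = toy_sales.sort_values(by="date")

# Running total and running maximum of weekly sales
toy_sales["cum_weekly_sales"] = toy_sales["weekly_sales"].cumsum()
toy_sales["cum_max_sales"] = toy_sales["weekly_sales"].cummax()

print(toy_sales[["date", "weekly_sales", "cum_weekly_sales", "cum_max_sales"]])
```

After sorting, the weekly sales read 20, 10, 30, so the running total is 20, 30, 60 and the running maximum stays at 20 until the last week raises it to 30.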
Dropping duplicates
sales.drop_duplicates(subset = ['store','type'])
# drop duplicates with condition
sales[sales['is_holiday'] == True].drop_duplicates(subset = 'date')
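To see which rows survive, here is a minimal sketch on a made-up table. The `toy_sales` data is hypothetical; by default drop_duplicates keeps the first occurrence of each duplicate group.

```python
import pandas as pd

# Hypothetical toy data: two stores, two weeks each
toy_sales = pd.DataFrame({
    "store": [1, 1, 2, 2],
    "type": ["A", "A", "B", "B"],
    "date": ["2021-01-01", "2021-01-08", "2021-01-01", "2021-01-08"],
    "is_holiday": [True, False, True, False],
})

# One row per (store, type) pair; the first occurrence is kept
unique_stores = toy_sales.drop_duplicates(subset=["store", "type"])

# Filter to holiday rows first, then keep one row per date
holiday_dates = toy_sales[toy_sales["is_holiday"]].drop_duplicates(subset="date")

print(unique_stores)
print(holiday_dates)
```

Filtering with the boolean column itself (`toy_sales["is_holiday"]`) gives the same rows as comparing it to True.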
Counting categorical variables
# number count
store['type'].value_counts()
# proportions (they sum to 1)
store['type'].value_counts(normalize = True)
# sorted by count, largest first (sort = True is already the default)
store['type'].value_counts(sort = True)
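A quick sketch of the three calls on made-up data, so the difference between counts and proportions is visible. The `toy_store` table is a hypothetical stand-in for the real one.

```python
import pandas as pd

# Hypothetical toy data: three stores of type A, one of type B
toy_store = pd.DataFrame({"type": ["A", "A", "A", "B"]})

# Raw counts per category
counts = toy_store["type"].value_counts()

# Proportions per category; these add up to 1
props = toy_store["type"].value_counts(normalize=True)

# sort=True is the default, so this matches `counts`
sorted_counts = toy_store["type"].value_counts(sort=True)

print(counts)
print(props)
```

Here type A shows up 3 times out of 4, so its proportion is 0.75 and type B gets 0.25.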
Summary
Even after studying this, I find I can't quite remember it when I come back to it.
I just have to keep at it consistently.