Indexing and Selecting Data in Python – How to Slice and Dice Pandas Series and DataFrames

Introduction

Indexing and Selecting Data

The axis labeling information in pandas objects serves several purposes:

  • Enables automatic and explicit data alignment.
  • Allows intuitive getting and setting of subsets of the data set.
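
As a rough illustration of both points, the sketch below uses small made-up Series and a made-up DataFrame (the names s1, s2 and df are assumptions for this example only). It shows values being matched by label during arithmetic, an explicit reindex, and label-based getting and setting with .loc.

```python
import pandas as pd

# Two Series with partially overlapping labels (illustrative data only)
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
s2 = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Automatic alignment: values are matched by label, not by position.
# Labels present in only one Series ('a' and 'd') produce NaN.
total = s1 + s2

# Explicit alignment: put s2 onto s1's labels before combining
aligned = s2.reindex(s1.index)

# Getting and setting a subset by label
df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]},
                  index=['r1', 'r2', 'r3'])
subset = df.loc[['r1', 'r3'], 'x']   # get rows r1 and r3 of column x
df.loc['r2', 'y'] = 50               # set a single cell
```

Because alignment is by label, the order of the rows in s1 and s2 does not matter; only the index values do.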

The query() Method

DataFrame objects have a query() method that selects rows using an expression written as a string.

import numpy as np
import pandas as pd

# create a DataFrame of 10 rows and 3 columns (a, b, c) of random values
df4 = pd.DataFrame(np.random.rand(10, 3), columns=list('abc'))
df4

# with query(): select rows where a < b and b < c
df4.query('(a < b) & (b < c)')
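
For comparison, here is a minimal sketch of the same kind of filter written with plain boolean indexing next to its query() form. The seeded generator, the DataFrame name df and the threshold variable are assumptions made for this example; the columns a, b and c mirror the ones used in this section.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
df = pd.DataFrame(rng.random((10, 3)), columns=list('abc'))

# Plain boolean indexing: each comparison yields a boolean Series,
# combined element-wise with &
mask_result = df[(df['a'] < df['b']) & (df['b'] < df['c'])]

# The same condition as a query() expression string
query_result = df.query('(a < b) & (b < c)')

# Inside the expression, @name refers to a Python variable
threshold = 0.5
above = df.query('c > @threshold')
```

Both forms should return the same rows; query() mainly saves repeating the DataFrame name in every comparison.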

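Duplicate Data

  • duplicated: returns a boolean Series marking rows that repeat an earlier row.
  • drop_duplicates: removes duplicate rows.

# column 'a' holds repeated values; column 'c' is random
df5 = pd.DataFrame({'a': ['one', 'one', 'two', 'two', 'two'],
                    'b': ['x', 'y', 'x', 'y', 'x'],
                    'c': np.random.randn(5)})
df5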

# flag rows whose value in column 'a' already appeared in an earlier row
df5.duplicated('a')

# keep only the first row for each distinct value in column 'a'
df5.drop_duplicates('a')
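
Which rows count as duplicates depends on the columns passed and on the keep argument. The sketch below is an illustration only: it uses a DataFrame named df with fixed integers in column c (an assumption, so the results are repeatable) and the same a and b values as above.

```python
import pandas as pd

df = pd.DataFrame({'a': ['one', 'one', 'two', 'two', 'two'],
                   'b': ['x', 'y', 'x', 'y', 'x'],
                   'c': [1, 2, 3, 4, 5]})

# Default keep='first': the first row of each 'a' value survives
first = df.drop_duplicates('a')

# keep='last': the last row of each 'a' value survives
last = df.drop_duplicates('a', keep='last')

# keep=False: drop every row whose 'a' value occurs more than once
unique_only = df.drop_duplicates('a', keep=False)

# A list of columns treats the combination (a, b) as the key
by_ab = df.drop_duplicates(['a', 'b'])
flags = df.duplicated(['a', 'b'])
```

Here every value in column a repeats, so keep=False leaves an empty DataFrame, while the (a, b) key only repeats in the final row.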

Author: admin
