Want to replace some values, but not others, in your Pandas series or data frame? In this video, I introduce "where" and "mask", which help you accomplish this.
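As a rough illustration of what the two methods do (a minimal sketch with made-up sample values, not the code from the video): `where` keeps the values for which the condition is True and replaces the rest, while `mask` does the opposite.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
s = pd.Series([10, 20, 30, 40, 50])

# where: keep values where the condition is True, replace the others with 0
kept_large = s.where(s > 25, 0)

# mask: replace values where the condition is True, keep the others
hid_large = s.mask(s > 25, 0)

print(kept_large.tolist())
print(hid_large.tolist())
```

Because the replacement value here is an integer, the series stays an integer series. Leaving out the second argument replaces with a missing value instead, which is where the dtype questions in the comments come from.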
Comments: 4
@imothar 25 days ago
Another great video👍 Just wondering if there was any specific reason why you did not use pd.NA? Perhaps it's the same result in the end, when it comes to floats 🤷
@ReuvenLerner 25 days ago
The future of Pandas is clearly pd.NA, and I should use it more! But in this particular case, it didn't make a difference: Using either np.nan or pd.NA will turn the dtype into float64. That's because the standard int64 dtype isn't nullable, meaning that it cannot hold a missing value without first being converted to floats. If, however, you were to set the dtype to be Int64 (note the capital), then using pd.NA would indeed do what you (and I) want.
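To make the difference concrete, here is a small sketch (sample values are assumptions, not taken from the video) that applies `mask` with its default replacement to a plain int64 series and to a nullable Int64 series:

```python
import pandas as pd

# Hypothetical sample data
s_plain = pd.Series([10, 20, 30, 40, 50])                    # int64
s_nullable = pd.Series([10, 20, 30, 40, 50], dtype="Int64")  # nullable integer

# With no replacement value given, mask fills with a missing value
masked_plain = s_plain.mask(s_plain == 40)
masked_nullable = s_nullable.mask(s_nullable == 40)

print(masked_plain.dtype)     # the plain series is upcast to float
print(masked_nullable.dtype)  # the nullable series keeps its integer dtype
```

In the nullable case the hidden value is shown as `<NA>` and the remaining values stay integers.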
@marcinpohl3264 26 days ago
How do I use np.NaN in a way that does NOT change ints to floats?
@ReuvenLerner 26 days ago
NaN is a float. So if you want to have NaN in an int column, then the ints will need to change to floats. HOWEVER, if you create your series with a nullable type, then you can use pd.NA instead of np.nan, and you'll be all set. That's because pd.NA is compatible with a wide variety of types:

    In [12]: s = Series([10, 20, 30, 40, 50])
    In [13]: s.loc[3] = pd.NA
    In [14]: s
    Out[14]:
    0    10.0
    1    20.0
    2    30.0
    3     NaN
    4    50.0
    dtype: float64

    In [15]: s = Series([10, 20, 30, 40, 50], dtype='Int64')
    In [16]: s.loc[3] = pd.NA
    In [17]: s
    Out[17]:
    0      10
    1      20
    2      30
    3    <NA>
    4      50
    dtype: Int64
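If the series already exists with the default int64 dtype, one possible approach (a sketch under that assumption, not something shown in the thread) is to convert it with `astype("Int64")` before assigning pd.NA:

```python
import pandas as pd

# Hypothetical existing series with the default, non-nullable int64 dtype
s = pd.Series([10, 20, 30, 40, 50])

# Convert to the nullable integer dtype first, then assign the missing value
s_nullable = s.astype("Int64")
s_nullable.loc[3] = pd.NA

print(s_nullable.dtype)
print(s_nullable.isna().sum())
```

`astype` returns a new series, so the original `s` is left unchanged.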