The Series Structure

news/2024/7/7 15:34:21
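  1. Read the CSV file:

    import pandas as pd

    # Load the Fandango score comparison data into a DataFrame
    fandango = pd.read_csv('fandango_score_comparison.csv')
    # Selecting a single column returns a Series
    series_film = fandango['FILM']
    print(series_film[0:5])
    series_rt = fandango['RottenTomatoes']
    print(series_rt[0:5])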
    

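
The file `fandango_score_comparison.csv` is not reproduced here, so the following is a minimal sketch that reads a tiny in-memory CSV instead. The column names match the ones used above, but the two rows and their scores are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row stand-in for fandango_score_comparison.csv (scores are made up).
csv_text = "FILM,RottenTomatoes\nMinions (2015),54\nAnt-Man (2015),80\n"
fandango_demo = pd.read_csv(io.StringIO(csv_text))

# Selecting a single column returns a Series, not a DataFrame.
films = fandango_demo["FILM"]
print(type(films).__name__)
print(films[0:5])
```

Slicing past the end of a Series is safe: `films[0:5]` simply returns the rows that exist.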

  2. Create a Series

    # Import the Series object from pandas
    from pandas import Series

    film_names = series_film.values
    # print(type(film_names))
    # print(film_names)
    rt_scores = series_rt.values
    # print(rt_scores)
    # Build a Series of scores indexed by film name
    series_custom = Series(rt_scores, index=film_names)
    # Select several films by label
    print(series_custom[['Minions (2015)', 'Leviathan (2014)']])
    

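
As a self-contained sketch of the same idea, the snippet below builds a Series from plain Python lists rather than from the CSV columns. The film titles are taken from the examples above; the scores are assumed values chosen only for illustration.

```python
import pandas as pd

# Film titles from the tutorial; the scores are made-up illustrative values.
film_names = ["Minions (2015)", "Ant-Man (2015)", "Leviathan (2014)"]
rt_scores = [54, 80, 99]

# The index argument replaces the default 0..n-1 integer index with labels.
scores = pd.Series(rt_scores, index=film_names)

# Passing a list of labels returns a Series in the order requested.
subset = scores[["Leviathan (2014)", "Minions (2015)"]]
print(subset)
```

A single label, such as `scores["Ant-Man (2015)"]`, returns the scalar value rather than a Series.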

  3. Print a slice

    # Integer (positional) slicing is also available
    series_custom = Series(rt_scores, index=film_names)
    fiveten = series_custom[5:10]
    print(fiveten)
    

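
An integer slice such as `[5:10]` selects by position even though the index holds film names. The sketch below shows this on a small Series with assumed scores, and compares it with `iloc`, which states the positional intent explicitly.

```python
import pandas as pd

# Made-up scores for illustration; only the titles come from the tutorial.
scores = pd.Series(
    [54, 80, 99, 40],
    index=["Minions (2015)", "Ant-Man (2015)", "Leviathan (2014)", "Do You Believe? (2015)"],
)

# Both forms pick the rows at positions 1 and 2 (the end of the slice is exclusive).
by_slice = scores[1:3]
by_iloc = scores.iloc[1:3]
print(by_iloc)
```

Using `iloc` is arguably the clearer choice in new code, since a reader does not have to remember how bracket slicing treats integers.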

  4. Sorting

    # Sort the index labels, then reorder the Series to match
    original_index = series_custom.index.tolist()
    print(original_index)
    sorted_index = sorted(original_index)
    print(sorted_index)
    sorted_by_index = series_custom.reindex(sorted_index)
    print(sorted_by_index)
    


  5. Sorting by index and by values

    # sort_index() orders by label; sort_values() orders by score
    sc2 = series_custom.sort_index()
    sc3 = series_custom.sort_values()
    # print(sc2[0:10])
    print(sc3[0:10])
    

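
Steps 4 and 5 reach the same label ordering in two ways. The following sketch, using assumed scores, shows that reindexing with a sorted label list gives the same order as `sort_index()`, while `sort_values()` orders by score instead.

```python
import pandas as pd

# Made-up scores for illustration.
scores = pd.Series(
    [54, 80, 99],
    index=["Minions (2015)", "Ant-Man (2015)", "Leviathan (2014)"],
)

# Manual approach: sort the labels, then reorder the Series to match.
by_reindex = scores.reindex(sorted(scores.index.tolist()))

# Built-in approaches: order by label or by value (ascending by default).
by_label = scores.sort_index()
by_value = scores.sort_values()

print(by_label)
print(by_value)
```

All three calls return new Series objects and leave `scores` unchanged.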

  6. Addition

    # The values in a Series object are stored as an ndarray, the core data type in NumPy
    import numpy as np
    # Add the Series to itself element by element
    print(np.add(series_custom, series_custom))
    # Apply the sine function to each value
    np.sin(series_custom)
    # Return the highest value (a single value, not a Series)
    np.max(series_custom)
    

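
To make the return types concrete, here is a small sketch with assumed scores. Element-wise NumPy functions return a Series that keeps the film labels, while a reduction such as `np.max` returns a single value.

```python
import numpy as np
import pandas as pd

# Made-up scores for illustration.
scores = pd.Series(
    [54, 80, 99],
    index=["Minions (2015)", "Ant-Man (2015)", "Leviathan (2014)"],
)

doubled = np.add(scores, scores)  # element-wise, still a Series
sines = np.sin(scores)            # element-wise, still a Series
highest = np.max(scores)          # reduction, a single value

print(doubled)
print(highest)
```

The same results are available through pandas directly, for example `scores + scores` and `scores.max()`.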

  7. Boolean filtering

    # A comparison returns a Series with a boolean value for each film
    series_custom > 50
    series_greater_than_50 = series_custom[series_custom > 50]

    criteria_one = series_custom > 50
    criteria_two = series_custom < 75
    # Combine both masks with & to keep scores strictly between 50 and 75
    both_criteria = series_custom[criteria_one & criteria_two]
    print(both_criteria)
    

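
The two masks can also be written inline. The sketch below uses assumed scores; note that each comparison needs its own parentheses, because `&` binds more tightly than `>` and `<` in Python.

```python
import pandas as pd

# Made-up scores for illustration.
scores = pd.Series(
    [74, 85, 54, 40],
    index=["Film A", "Film B", "Film C", "Film D"],  # hypothetical labels
)

# A comparison yields a boolean Series with the same index.
mask = scores > 50

# Parenthesize each condition before combining them with &.
between = scores[(scores > 50) & (scores < 75)]
print(between)
```

Use `|` for "or" and `~` for "not" in the same way; the plain keywords `and`, `or`, and `not` do not work element-wise on a Series.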

  8. Arithmetic

    # Arithmetic between two Series aligns them on their index labels
    rt_critics = Series(fandango['RottenTomatoes'].values, index=fandango['FILM'])
    rt_users = Series(fandango['RottenTomatoes_User'].values, index=fandango['FILM'])
    rt_mean = (rt_critics + rt_users) / 2

    print(rt_mean)
    

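
In the example above both Series share the same index, so alignment is invisible. The following sketch uses two small Series with hypothetical labels and made-up scores, deliberately in a different order and with one label missing on each side, to show what alignment does.

```python
import pandas as pd

# Hypothetical labels and made-up scores, chosen to expose alignment.
critics = pd.Series([74, 85, 54], index=["A", "B", "C"])
users = pd.Series([80, 90, 70], index=["C", "A", "D"])

# Values are matched by label, not by position.
mean = (critics + users) / 2
print(mean)

# Labels present in only one Series ("B" and "D") produce NaN.
print(mean.isna())
```

If missing labels should count as zero instead, `critics.add(users, fill_value=0)` is one option.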

  9. set_index

    # set_index() returns a new DataFrame indexed by the values in the specified column.
    # By default it drops that column from the DataFrame;
    # drop=False keeps the FILM column as well.
    fandango = pd.read_csv('fandango_score_comparison.csv')
    print(type(fandango))
    fandango_films = fandango.set_index('FILM', drop=False)
    # print(fandango_films.index)
    


  10. Use the new index

    # Slice using either bracket notation or loc[]
    fandango_films["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]
    fandango_films.loc["Avengers: Age of Ultron (2015)":"Hot Tub Time Machine 2 (2015)"]

    # Select a specific movie
    fandango_films.loc['Kumiko, The Treasure Hunter (2015)']

    # Select a list of movies
    movies = ['Kumiko, The Treasure Hunter (2015)', 'Do You Believe? (2015)', 'Ant-Man (2015)']
    fandango_films.loc[movies]

    # When selecting multiple rows, a DataFrame is returned,
    # but when selecting an individual row, a Series object is returned instead
    

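
Steps 9 and 10 can be checked on a small DataFrame. This is a minimal sketch with made-up scores: it confirms that `drop=False` keeps the `FILM` column, that a label slice includes both endpoints, and that the return type depends on how many rows are selected.

```python
import pandas as pd

# Made-up scores for illustration; only the titles come from the tutorial.
demo = pd.DataFrame({
    "FILM": ["Minions (2015)", "Ant-Man (2015)", "Leviathan (2014)"],
    "RottenTomatoes": [54, 80, 99],
})

# drop=False keeps FILM as a regular column in addition to the index.
films = demo.set_index("FILM", drop=False)

one = films.loc["Ant-Man (2015)"]                         # single label
many = films.loc[["Ant-Man (2015)", "Minions (2015)"]]    # list of labels
sliced = films.loc["Ant-Man (2015)":"Leviathan (2014)"]   # both ends included

print(type(one).__name__, type(many).__name__, len(sliced))
```

Unlike positional slices, a label slice with `loc` includes the end label, which is why `sliced` holds two rows here.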

  11. Data types and apply()

    # The apply() method in pandas lets us run custom Python logic.
    # With the default axis, the function passed to apply() is called
    # once per column, and each column arrives as a Series object.
    import numpy as np

    # dtypes returns the data types as a Series
    types = fandango_films.dtypes
    # print(types)
    # Filter the data types to just floats; the index attribute returns just the column names
    float_columns = types[types.values == 'float64'].index
    # Use bracket notation to filter columns to just the float columns
    float_df = fandango_films[float_columns]
    # print(float_df)
    # `x` is a Series object representing a column
    deviations = float_df.apply(lambda x: np.std(x))

    print(deviations)
    


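  12. Use an anonymous function with std() to compute the standard deviation

    rt_mt_user = float_df[['RT_user_norm', 'Metacritic_user_nom']]
    # axis=1 applies the function to each row instead of each column
    rt_mt_user.apply(lambda x: np.std(x), axis=1)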
    

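
The difference between steps 11 and 12 is the `axis` argument. The sketch below uses a three-row DataFrame with made-up values under the same two column names. It also shows one detail worth knowing: `np.std` computes the population standard deviation (`ddof=0`), whereas the pandas `std()` method defaults to the sample standard deviation (`ddof=1`).

```python
import numpy as np
import pandas as pd

# Made-up normalized scores for illustration.
demo = pd.DataFrame({
    "RT_user_norm": [4.0, 3.0, 2.0],
    "Metacritic_user_nom": [3.0, 3.0, 4.0],
})

# Default axis=0: the lambda receives each column as a Series.
per_column = demo.apply(lambda col: np.std(col))

# axis=1: the lambda receives each row as a Series.
per_row = demo.apply(lambda row: np.std(row), axis=1)

# pandas' own std() uses ddof=1 unless told otherwise.
pandas_default = demo.std()

print(per_column)
print(per_row)
print(pandas_default)
```

Passing `ddof=0` to `demo.std()` should reproduce the `np.std` figures if the two need to agree.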


