Metadata-Version: 2.1
Name: pandas
Version: 1.2.2
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: https://pandas.pydata.org
Maintainer: The PyData Development Team
Maintainer-email: pydata@googlegroups.com
License: BSD
Project-URL: Bug Tracker, https://github.com/pandas-dev/pandas/issues
Project-URL: Documentation, https://pandas.pydata.org/pandas-docs/stable/
Project-URL: Source Code, https://github.com/pandas-dev/pandas
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Cython
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.7.1
Requires-Dist: python-dateutil (>=2.7.3)
Requires-Dist: pytz (>=2017.3)
Requires-Dist: numpy (>=1.16.5)
Provides-Extra: test
Requires-Dist: pytest (>=5.0.1) ; extra == 'test'
Requires-Dist: pytest-xdist ; extra == 'test'
Requires-Dist: hypothesis (>=3.58) ; extra == 'test'

**pandas** is a Python package that provides fast, flexible, and expressive data
structures designed to make working with structured (tabular, multidimensional,
potentially heterogeneous) and time series data both easy and intuitive. It
aims to be the fundamental high-level building block for doing practical,
**real world** data analysis in Python. Additionally, it has the broader goal
of becoming **the most powerful and flexible open source data analysis /
manipulation tool available in any language**. It is already well on its way
toward this goal.

pandas is well suited for many different kinds of data:

  - Tabular data with heterogeneously typed columns, as in an SQL table or
    Excel spreadsheet
  - Ordered and unordered (not necessarily fixed-frequency) time series data
  - Arbitrary matrix data (homogeneously typed or heterogeneous) with row and
    column labels
  - Any other form of observational / statistical data sets. The data
    need not be labeled at all to be placed into a pandas data structure

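As a rough illustration of the kinds of data listed above, the following sketch builds one small object for each. All column names, labels, and values are made up for the example; they are not part of pandas itself.

```python
import numpy as np
import pandas as pd

# Tabular data with heterogeneously typed columns (like an SQL table).
table = pd.DataFrame(
    {
        "name": ["a", "b", "c"],     # strings
        "count": [1, 2, 3],          # integers
        "ratio": [0.5, 1.5, 2.5],    # floats
    }
)

# Time series with irregularly spaced timestamps (no fixed frequency).
stamps = pd.to_datetime(["2021-01-01", "2021-01-04", "2021-01-09"])
series = pd.Series([10.0, 12.0, 11.0], index=stamps)

# Matrix data with row and column labels, built from a NumPy array.
matrix = pd.DataFrame(
    np.arange(6).reshape(2, 3),
    index=["r0", "r1"],
    columns=["x", "y", "z"],
)

# Unlabeled data: pandas supplies a default integer index.
unlabeled = pd.Series([3.0, 1.0, 2.0])
```

Here ``matrix.loc["r1", "y"]`` looks a value up by its row and column labels, while ``unlabeled`` is addressed by the default positions 0, 1, 2.
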
The two primary data structures of pandas, Series (1-dimensional) and DataFrame
(2-dimensional), handle the vast majority of typical use cases in finance,
statistics, social science, and many areas of engineering. For R users,
DataFrame provides everything that R's ``data.frame`` provides and much
more. pandas is built on top of `NumPy <https://www.numpy.org>`__ and is
intended to integrate well within a scientific computing environment with many
other third-party libraries.

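A minimal sketch of how the two structures relate to each other and to NumPy follows. The labels ``a``, ``b``, ``c`` and the column names ``price`` and ``qty`` are hypothetical.

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labeled array.
prices = pd.Series([1.5, 2.0, 2.5], index=["a", "b", "c"])

# A DataFrame is a 2-dimensional labeled table; each column is a Series.
frame = pd.DataFrame({"price": prices, "qty": [10, 20, 30]})

# Selecting a single column returns a Series.
column = frame["qty"]

# The data can be handed to NumPy as a plain array.
values = frame.to_numpy()

# NumPy functions also accept pandas objects directly.
total = np.sum(frame["price"] * frame["qty"])
```

In this sketch ``values`` is an ordinary two-dimensional ``numpy.ndarray``, so it can be passed to any library that expects NumPy input.
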
Here are just a few of the things that pandas does well:

  - Easy handling of **missing data** (represented as NaN) in floating point as
    well as non-floating point data
  - Size mutability: columns can be **inserted and deleted** from DataFrame and
    higher-dimensional objects
  - Automatic and explicit **data alignment**: objects can be explicitly
    aligned to a set of labels, or the user can simply ignore the labels and
    let `Series`, `DataFrame`, etc. automatically align the data in
    computations
  - Powerful, flexible **group by** functionality to perform
    split-apply-combine operations on data sets, for both aggregating and
    transforming data
  - **Easy conversion** of ragged, differently-indexed data in other
    Python and NumPy data structures into DataFrame objects
  - Intelligent label-based **slicing**, **fancy indexing**, and **subsetting**
    of large data sets
  - Intuitive **merging** and **joining** of data sets
  - Flexible **reshaping** and pivoting of data sets
  - **Hierarchical** labeling of axes (possible to have multiple labels per
    tick)
  - Robust IO tools for loading data from **flat files** (CSV and delimited),
    Excel files, and databases, and for saving / loading data from the
    ultrafast **HDF5 format**
  - **Time series**-specific functionality: date range generation and frequency
    conversion, moving window statistics, date shifting and lagging

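Several of the features above can be sketched in a few lines. This is only an illustration on invented data (the ``sales`` and ``managers`` tables and all their values are assumptions), not a complete tour of the API.

```python
import numpy as np
import pandas as pd

# Missing data: NaN marks an absent value and is skipped by default in sums.
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
total = s.sum()
filled = s.fillna(0.0)

# Automatic alignment: arithmetic matches on labels, not on position.
t = pd.Series([10.0, 20.0], index=["c", "a"])
aligned = s + t  # "b" has no usable partner, so the result there is NaN

# Group by: split the rows by region, then sum each group.
sales = pd.DataFrame(
    {
        "region": ["east", "west", "east", "west"],
        "year": [2020, 2020, 2021, 2021],
        "amount": [5, 7, 6, 9],
    }
)
by_region = sales.groupby("region")["amount"].sum()

# Merging: attach a lookup table on a shared key column.
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Ann", "Bob"]})
merged = sales.merge(managers, on="region", how="left")

# Reshaping: pivot long data into one row per region, one column per year.
wide = sales.pivot(index="region", columns="year", values="amount")

# Time series: generate a daily date range and lag the values by one step.
days = pd.Series(range(4), index=pd.date_range("2021-01-01", periods=4, freq="D"))
lagged = days.shift(1)
```

Each step returns a new object, so intermediate results such as ``by_region`` or ``wide`` can be inspected or reused independently.
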
Many of these principles are here to address the shortcomings frequently
experienced using other languages / scientific research environments. For data
scientists, working with data is typically divided into multiple stages:
munging and cleaning data, analyzing / modeling it, then organizing the results
of the analysis into a form suitable for plotting or tabular display. pandas is
the ideal tool for all of these tasks.

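The three stages described above might look roughly like this on a toy data set. The ``raw`` table, its column names, and the choice to fill gaps with the overall mean are all assumptions made for the sake of the example.

```python
import numpy as np
import pandas as pd

# Hypothetical raw input with inconsistent text and missing values.
raw = pd.DataFrame(
    {
        "city": ["Oslo", "oslo ", "Lima", "Lima", None],
        "temp_c": [2.0, np.nan, 19.0, 21.0, 15.0],
    }
)

# 1. Munging and cleaning: drop rows without a city, normalize the
#    spelling, and fill missing temperatures with the overall mean
#    (an arbitrary choice for this sketch).
clean = raw.dropna(subset=["city"]).copy()
clean["city"] = clean["city"].str.strip().str.title()
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())

# 2. Analysis: aggregate per city.
summary = clean.groupby("city")["temp_c"].agg(["mean", "max"])

# 3. Presentation: round, sort, and render as a plain-text table.
report = summary.round(1).sort_values("mean", ascending=False)
text = report.to_string()
```

The resulting ``report`` is itself a DataFrame, so it can be printed as text, as here, or passed on to a plotting library.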