  1. Python: save pandas data frame to parquet file - Stack Overflow

    Jan 29, 2019 · Is it possible to save a pandas data frame directly to a parquet file? If not, what would be the suggested process? The aim is to be able to send the parquet file to another team, which they …

  2. What are the pros and cons of the Apache Parquet format compared …

    Apr 24, 2016 · Parquet has gained significant traction outside of the Hadoop ecosystem. For example, the Delta Lake project is being built on Parquet files. Arrow is an important project that makes it easy …

  3. Read all Parquet files saved in a folder via Spark

    You can write data into a folder rather than as separate Spark "files" (in fact folders) 1.parquet, 2.parquet, etc. If you don't set a file name but only a path, Spark will put files into the folder as real files (not folders), and …

  4. How to read a Parquet file into Pandas DataFrame?

    How to read a modestly sized Parquet data-set into an in-memory Pandas DataFrame without setting up a cluster computing infrastructure such as Hadoop or Spark? This is only a moderate amount of data …

  5. Convert csv to parquet file using python - Stack Overflow

    May 30, 2018 · I am trying to convert a .csv file to a .parquet file. The csv file (Temp.csv) has the following format: 1,Jon,Doe,Denver. I am using the following Python code to convert it into Parquet …

  6. python - How to read a list of parquet files from S3 as a pandas ...

    import pyarrow.parquet as pq; dataset = pq.ParquetDataset('parquet/'); table = dataset.read(); df = table.to_pandas(). Both work like a charm. Now I want to achieve the same remotely with files stored …

  7. How to view Apache Parquet file in Windows? - Stack Overflow

    Jun 19, 2018 · What is Apache Parquet? Apache Parquet is a binary file format that stores data in a columnar fashion. Data inside a Parquet file is similar to an RDBMS style table where you have …

  8. Reading parquet files from multiple directories in Pyspark

    May 16, 2016 · I need to read parquet files from multiple paths that are not parent or child directories. for example,

  9. Unable to infer schema when loading Parquet file

    The documentation for parquet says the format is self describing, and the full schema was available when the parquet file was saved. What gives? Using Spark 2.1.1. Also fails in 2.2.0. Found this bug …

  10. spark 2.4 Parquet column cannot be converted in file, Column ...

    Nov 28, 2019 · The vectorized Parquet reader enables native record-level filtering using push-down filters, improving memory locality, and cache utilization. If you disable the vectorized Parquet reader, …