Spark / SPARK-8202

PySpark: infinite loop during external sort

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.4.0
    • Fix Version/s: 1.4.1, 1.5.0
    • Component/s: PySpark
    • Labels: None

    Description

      During external sort, the batch size grows up to a maximum of 10000 and then shrinks down to zero, causing an infinite loop.

      Assuming that items usually have similar sizes, we don't need to adjust the batch size after the first spill.
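
      The proposed policy can be illustrated with a minimal sketch. This is not the PySpark source: `read_in_batches`, the `over_limit` callback (a stand-in for a real memory check) and the 1.5x growth factor are assumptions made for illustration. Only the 10000 cap and the "stop adjusting after the first spill" rule come from this issue. The comment on the exit test shows why a batch size of zero could never terminate a loop of this shape.

```python
import itertools

MAX_BATCH = 10000  # upper bound on the batch size mentioned in this issue


def read_in_batches(iterator, over_limit, initial_batch=100):
    """Consume `iterator` in batches and return (final_batch_size, spill_count).

    `over_limit(buffered)` is a hypothetical callback that reports whether the
    number of buffered items exceeds the memory budget.
    """
    iterator = iter(iterator)
    batch = initial_batch
    buffered = 0
    spills = 0
    while True:
        chunk = list(itertools.islice(iterator, batch))
        buffered += len(chunk)
        # A short read means the input is exhausted. If batch were 0, this
        # test would be `0 < 0`, which is never true, so the loop would spin.
        if len(chunk) < batch:
            break
        if over_limit(buffered):
            # Simulated spill: a real sorter would sort and write to disk here.
            spills += 1
            buffered = 0
        elif spills == 0:
            # Grow only before the first spill; afterwards the size is frozen,
            # so it can never shrink to zero.
            batch = min(int(batch * 1.5), MAX_BATCH)
    return batch, spills
```

      For example, `read_in_batches(range(100000), lambda n: n > 5000)` terminates with a batch size between 1 and 10000 and at least one simulated spill.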

          People

            Assignee: Davies Liu (davies)
            Reporter: Davies Liu (davies)
            Votes: 0
            Watchers: 3
