I took the example from here, changed only the baseDir folder, and put the file in it. The error appears every time: Exception in thread "main" org.apache.spark.SparkException: Job 1 : java.lang.NullPointerException. It points at the last line, and the output file is never written.

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.schema.Schema;
import org.datavec.api.writable.Writable;
import org.datavec.spark.transform.SparkTransformExecutor;
import org.datavec.spark.transform.misc.StringToWritablesFunction;
import org.datavec.spark.transform.misc.WritablesToStringFunction;

import java.util.Date;
import java.util.List;

public class StormReportsRecordReader {

    public static void main(String[] args) throws Exception {

        int numLinesToSkip = 0;
        String delimiter = ",";

        /*
         * Specify the root directory.
         * If you are working from home, replace baseDir with the
         * location you downloaded the reports.csv file to.
         */
        String baseDir = "/Users/storm/";
        String fileName = "reports.csv";
        String inputPath = baseDir + fileName;
        String timeStamp = String.valueOf(new Date().getTime());
        String outputPath = baseDir + "reports_processed_" + timeStamp;

        /*
         * The data file looks like this:
         * 161006-1655,UNK,2 SE BARTLETT,LABETTE,KS,37.03,-95.19,
         * TRAINED SPOTTER REPORTS TORNADO ON THE GROUND. (ICT),TOR
         * The fields are:
         * datetime,severity,location,county,state,lat,lon,comment,type
         */
        Schema inputDataSchema = new Schema.Builder()
                .addColumnsString("datetime", "severity", "location", "county", "state")
                .addColumnsDouble("lat", "lon")
                .addColumnsString("comment")
                .addColumnCategorical("type", "TOR", "WIND", "HAIL")
                .build();

        /*
         * Define a transform process to extract lat and lon,
         * and to transform type from one of three strings to 0, 1 or 2.
         */
        TransformProcess tp = new TransformProcess.Builder(inputDataSchema)
                .removeColumns("datetime", "severity", "location", "county", "state", "comment")
                .categoricalToInteger("type")
                .build();

        // Step through and print the schema after each transform step
        int numActions = tp.getActionList().size();
        for (int i = 0; i < numActions; i++) {
            System.out.println("\n\n===============================");
            System.out.println("--- Schema after step " + i + " (" + tp.getActionList().get(i) + ")--");
            System.out.println(tp.getSchemaAfterStep(i));
        }

        SparkConf sparkConf = new SparkConf();
        sparkConf.setMaster("local[*]");
        sparkConf.setAppName("Storm Reports Record Reader Transform");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        /*
         * Get the data into a Spark RDD and run the
         * transform process on it.
         */
        // read the data file
        JavaRDD<String> lines = sc.textFile(inputPath);
        // convert each line to a List<Writable>
        JavaRDD<List<Writable>> stormReports = lines.map(new StringToWritablesFunction(new CSVRecordReader()));
        // run the transform process
        JavaRDD<List<Writable>> processed = SparkTransformExecutor.execute(stormReports, tp);
        // convert the Writables back to strings for export
        JavaRDD<String> toSave = processed.map(new WritablesToStringFunction(","));
        toSave.saveAsTextFile(outputPath);
    }
}
  • Possible duplicate of: What is Null Pointer Exception and how to fix it? - Alexey Shimansky
  • @Alexey Shimansky, the question is not "where does the NullPointerException come from," but "why does it crash when working with the file". - ߊߚߤߘ
  • @Arhad "What is Null Pointer Exception and how to fix it?" - Alexey Shimansky

1 answer

Specify the full path to the directory containing the file ("./Users/storm/"); if you run it from the IDE, check the working directory of the project.
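
A quick way to confirm this is to fail fast before Spark ever starts. A minimal sketch, assuming it is placed right after inputPath is built in the question's main method (the check itself is an addition, not part of the original example):

import java.io.File;

// Fail fast with a clear message if the CSV cannot be found, instead of
// letting the Spark job die later with a NullPointerException.
File input = new File(inputPath);
if (!input.exists()) {
    throw new IllegalArgumentException("Input file not found: "
            + input.getAbsolutePath()
            + " (working directory: " + System.getProperty("user.dir") + ")");
}

Printing the absolute path and the JVM's working directory usually makes an IDE working-directory mismatch obvious at a glance.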