  • Post category:Scala
  • Post last modified:March 27, 2024

In this article, we will learn how to validate an XML document against an XSD schema and report the resulting error, warning, and fatal messages using Scala and Java. The javax.xml.validation package provides the validation API, and the same API can be used from both languages.


First, create the following XML file and XSD file and store them in the resources folder. The examples below run in a Maven project, and the same examples are available on GitHub for your reference.
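Both examples in this article load the files from the classpath with ClassLoader.getSystemResource(), which returns null when a file is missing from the resources folder. The following is a minimal Java sketch of a safer lookup; the ResourceOpener class and its open() method are hypothetical names introduced here for illustration and are not part of the original project.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

// Hypothetical helper, not part of the original project.
class ResourceOpener {

    // Opens a classpath resource and fails with a descriptive exception,
    // rather than a NullPointerException, when the file is missing.
    static InputStream open(String name) throws IOException {
        URL url = ClassLoader.getSystemResource(name);
        if (url == null) {
            throw new FileNotFoundException("Resource not found on classpath: " + name);
        }
        return url.openStream();
    }

    public static void main(String[] args) throws IOException {
        // "test.xsd" is the schema file name used in this article.
        // try-with-resources closes the stream when the block ends.
        try (InputStream in = open("test.xsd")) {
            System.out.println("Opened test.xsd, bytes available: " + in.available());
        }
    }
}
```

With a helper like this, a misplaced test.xml or test.xsd shows up as a clear "Resource not found" message.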

XML File:

<?xml version="1.0" encoding="UTF-8"?>
<root>
	<records>
		<record>
			<title>Got to Be There</title>
			<artist>Michael Jackson</artist>
			<genre>pop</genre>
			<year>1971</year>
		</record>
		<record>
			<artist>Music  Me</artist>
			<genre></genre>
			<year/>
		</record>
		<record>
			<title>x</title>
			<artist>Music  Me</artist>
			<genre></genre>
			<year/>
		</record>
	</records>
</root>

XSD file (the schema we validate against):

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" attributeFormDefault="unqualified"
		   elementFormDefault="qualified">
	<xs:element name="root" type="rootType">
	</xs:element>

	<xs:complexType name="rootType">
		<xs:sequence>
			<xs:element name="records" type="recordsType"/>
		</xs:sequence>
	</xs:complexType>

	<xs:complexType name="recordsType">
		<xs:sequence>
			<xs:element name="record" type="recordType" maxOccurs="unbounded" minOccurs="0"/>
		</xs:sequence>
	</xs:complexType>

	<xs:complexType name="recordType">
		<xs:sequence>
			<xs:element type="titleType" name="title" />
			<xs:element type="xs:string" name="artist"/>
			<xs:element type="xs:string" name="genre"/>
			<xs:element type="xs:short" name="year"/>
		</xs:sequence>
	</xs:complexType>

	<xs:simpleType name="titleType">
		<xs:restriction base="xs:token">
			<xs:minLength value="5"/>
			<xs:maxLength value="60"/>
		</xs:restriction>
	</xs:simpleType>
</xs:schema>

Validate XML using Scala

Below is the complete Scala code, which validates the XML above against the XSD schema and prints all error, warning, and fatal messages.


package com.sparkbyexamples.scala

import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.{Schema, SchemaFactory, Validator}
import org.xml.sax.{ErrorHandler, SAXParseException}

object XMLValidatorUsingXSD extends App {

  validate("test.xml", "test.xsd")

  def validate(xmlFile: String, xsdFile: String): Unit = {
    // Messages are prepended, so the list ends up in reverse document order
    var exceptions = List[String]()
    try {
      // Build a Schema object from the XSD found on the classpath
      val schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
      val url = ClassLoader.getSystemResource(xsdFile)
      val schema: Schema = schemaFactory.newSchema(new StreamSource(url.openStream()))

      val validator: Validator = schema.newValidator()
      // Collect every message instead of stopping at the first problem
      validator.setErrorHandler(new ErrorHandler {
        override def warning(exception: SAXParseException): Unit = {
          exceptions = exception.getMessage :: exceptions
        }
        override def fatalError(exception: SAXParseException): Unit = {
          exceptions = exception.getMessage :: exceptions
        }
        override def error(exception: SAXParseException): Unit = {
          exceptions = exception.getMessage :: exceptions
        }
      })

      val xmlUrl = ClassLoader.getSystemResource(xmlFile)
      validator.validate(new StreamSource(xmlUrl.openStream()))
      exceptions.foreach(println)
    } catch {
      // Print the failure instead of silently discarding it
      case ex: Exception => println(ex.getMessage)
    }
  }
}

Let's look at what is happening here. First, we create a SchemaFactory instance using SchemaFactory.newInstance(). Next, we create a Schema object by calling schemaFactory.newSchema(), which takes the XSD file as a parameter. Then we create a javax.xml.validation.Validator instance by calling the newValidator() method on the schema object.
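The three steps can be sketched in isolation as follows. The sketch is in Java, since the API is the same from both languages. It assumes a trimmed-down inline schema with a single year element instead of the files from this article, and the FailFastValidation class and firstError() method are hypothetical names. No ErrorHandler is set here, so validate() throws a SAXException on the first error.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical class; the schema is a trimmed-down inline stand-in for test.xsd.
class FailFastValidation {

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"year\" type=\"xs:short\"/>"
      + "</xs:schema>";

    // Returns null when the document is valid, otherwise the first error message.
    static String firstError(String xml) {
        try {
            // Step 1: factory for W3C XML Schema
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Step 2: compile the XSD into a Schema object
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            // Step 3: create a Validator and run it against the XML
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return null;
        } catch (SAXException e) {
            // Without an ErrorHandler, the first error ends validation here
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstError("<year>1971</year>"));
        System.out.println(firstError("<year/>"));
    }
}
```

This fail-fast behaviour is enough for a plain valid/invalid check. Reporting every problem in a document needs an ErrorHandler.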

Finally, we call the validate() method on the validator object, passing in the XML file. We also create an ErrorHandler that overrides the warning(), error(), and fatalError() methods and pass it to setErrorHandler() on the validator so that it captures the XSD messages. When you run this example, it collects all XSD error, fatal, and warning messages in a list and prints them. Because each message is prepended to the list, they are printed in reverse document order.
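As a variation on the handler described above, the following Java sketch appends each message to a list, so the messages stay in document order, and prefixes it with its severity. It assumes a shortened inline schema (one record with a title and a year) in place of test.xsd. The CollectingValidation class name and the WARNING/ERROR/FATAL labels are choices made for this illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class; the schema is a shortened inline stand-in for test.xsd.
class CollectingValidation {

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"record\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"title\" type=\"titleType\"/>"
      + "<xs:element name=\"year\" type=\"xs:short\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "<xs:simpleType name=\"titleType\"><xs:restriction base=\"xs:token\">"
      + "<xs:minLength value=\"5\"/></xs:restriction></xs:simpleType>"
      + "</xs:schema>";

    // Returns every message reported for the document, in document order.
    static List<String> validate(String xsd, String xml) {
        List<String> messages = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { messages.add("WARNING: " + e.getMessage()); }
                @Override public void error(SAXParseException e) { messages.add("ERROR: " + e.getMessage()); }
                @Override public void fatalError(SAXParseException e) { messages.add("FATAL: " + e.getMessage()); }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Validation could not continue (for example, a broken schema).
            // Record the exception only if the handler captured nothing.
            if (messages.isEmpty()) {
                messages.add("FATAL: " + e.getMessage());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    public static void main(String[] args) {
        validate(XSD, "<record><title>x</title><year/></record>").forEach(System.out::println);
    }
}
```

An empty list means the document is valid, so a caller can inspect the result without parsing console output.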

Output:


cvc-type.3.1.3: The value '' of element 'year' is not valid.
cvc-datatype-valid.1.2.1: '' is not a valid value for 'integer'.
cvc-type.3.1.3: The value 'x' of element 'title' is not valid.
cvc-minLength-valid: Value 'x' with length = '1' is not facet-valid with respect to minLength '5' for type 'titleType'.
cvc-type.3.1.3: The value '' of element 'year' is not valid.
cvc-datatype-valid.1.2.1: '' is not a valid value for 'integer'.
cvc-complex-type.2.4.a: Invalid content was found starting with element 'artist'. One of '{title}' is expected.
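The messages above do not say where in the file each problem occurs. SAXParseException also exposes getLineNumber() and getColumnNumber(), so a handler can include the position. The Java sketch below shows one way to do that, assuming a shortened inline schema in place of test.xsd. The LocatedMessages class and the "line N, column M" format are illustrative choices, not part of the original example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical class; the schema is a shortened inline stand-in for test.xsd.
class LocatedMessages {

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"record\"><xs:complexType><xs:sequence>"
      + "<xs:element name=\"title\" type=\"titleType\"/>"
      + "<xs:element name=\"year\" type=\"xs:short\"/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "<xs:simpleType name=\"titleType\"><xs:restriction base=\"xs:token\">"
      + "<xs:minLength value=\"5\"/></xs:restriction></xs:simpleType>"
      + "</xs:schema>";

    // Returns each reported message prefixed with its line and column.
    static List<String> validate(String xml) {
        List<String> messages = new ArrayList<>();
        ErrorHandler handler = new ErrorHandler() {
            private void record(SAXParseException e) {
                messages.add("line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage());
            }
            @Override public void warning(SAXParseException e) { record(e); }
            @Override public void error(SAXParseException e) { record(e); }
            @Override public void fatalError(SAXParseException e) { record(e); }
        };
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(handler);
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Record the exception only if the handler captured nothing
            if (messages.isEmpty()) {
                messages.add(e.getMessage());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return messages;
    }

    public static void main(String[] args) {
        String xml = "<record>\n<title>x</title>\n<year>1971</year>\n</record>";
        validate(xml).forEach(System.out::println);
    }
}
```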

Validate XML using Java

The example below validates XML against an XSD schema in Java. The code is similar to the Scala version; only the syntax differs.


package com.sparkbyexamples.java;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import java.io.IOException;
import java.net.URL;

public class XmlValidator {

    public static void main(String[] args) {
        validate("test.xml", "test.xsd");
    }

    public static void validate(String xmlFile, String schemaFile) {

        try {
            // Build a Schema object from the XSD found on the classpath
            SchemaFactory schemaFactory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            URL url = ClassLoader.getSystemResource(schemaFile);
            Schema schema = schemaFactory.newSchema(new StreamSource(url.openStream()));
            Validator validator = schema.newValidator();
            // Print every message instead of stopping at the first problem
            validator.setErrorHandler(new ErrorHandler() {

                @Override
                public void error(SAXParseException exception) {
                    System.out.println("Error => " + exception.getMessage());
                }

                @Override
                public void warning(SAXParseException exception) {
                    System.out.println("Warning => " + exception.getMessage());
                }

                @Override
                public void fatalError(SAXParseException exception) {
                    System.out.println("Fatal => " + exception.getMessage());
                }
            });
            URL xmlUrl = ClassLoader.getSystemResource(xmlFile);
            validator.validate(new StreamSource(xmlUrl.openStream()));

        } catch (SAXException e) {
            System.out.println(e.getMessage());

        } catch (IOException e) {
            e.printStackTrace();

        }
    }
}

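Running this program produces the same messages as the Scala example, but each one is printed as soon as it is reported (so in document order) and prefixed with Error =>, Warning =>, or Fatal =>.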
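Both programs compile the schema inside validate(), which is fine for a single file. When many documents share one XSD, the javax.xml.validation documentation describes Schema as thread-safe and Validator as not thread-safe, so a common pattern is to compile the Schema once and create a new Validator per document. The Java sketch below illustrates this with a one-element inline schema; SchemaReuse, compile(), and isValid() are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical class; the schema is a one-element inline stand-in for test.xsd.
class SchemaReuse {

    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"year\" type=\"xs:short\"/>"
      + "</xs:schema>";

    // Compiles the XSD once; the returned Schema can be shared.
    static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Invalid schema: " + e.getMessage(), e);
        }
    }

    // Creates a fresh Validator per document and reports only valid or invalid.
    static boolean isValid(Schema schema, String xml) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Schema schema = compile(XSD);
        for (String xml : new String[] {"<year>1971</year>", "<year/>", "<year>abc</year>"}) {
            System.out.println(xml + " valid? " + isValid(schema, xml));
        }
    }
}
```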

Online Validators

If you want to validate XML quickly, there are many online tools available. I usually use Free online XML validator. To validate very large files locally, you can use the XML validator plugins for the Notepad++ text editor. There are many other tools you can explore online.

Conclusion

In this article, you have learned how to validate XML files against an XSD schema using Scala and Java.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium