Saturday, March 16, 2024

Spring Data JPA - Join Fetch, Entity Graphs - to fetch nested levels of OneToMany related data . . .

One-to-many relationship in Databases is quite common. It is also quite cumbersome in terms of how many aspects that need to be considered for getting it correctly implemented. Just to list few aspects - related JPA annotations, relationship keys specified in annotations, fetch modes, fetch types, joins, query types, performance, N+1 data problem, cartesian product, DISTINCT to eliminate duplicates, indexes etc. With Spring Data, JPA and Hibernate as the default implementation provider there are few JPA annotations, Hibernate specific annotations, JPQL queries, Java collection types, all these will get added to the mix.

Environment: Java 21, Spring Boot 3.2.3, PostgreSQL 15.3, maven 3.9.6 on macOS Catalina 10.15.7

If you have JPA entities related through OneToMany relationships two multiple levels, then some special care is required to fetch data in a performant manner by avoiding the classic N+1 query issue, or even multiple queries. Eash query is a roundtrip to Database and it adds up its own baggage.

Let's take a simple example of Country has many States, State has many Cities. We want to represent this relationship in JPA Entities and query using Spring Data JPA Repositories.

The entities with relationships look something like:

public abstract class BaseEntity implements Serializable { private static final long serialVersionUID = 1L; @Id @GeneratedValue(strategy = GenerationType.IDENTITY) @Column(nullable = false, updatable = false) @ToString.Exclude protected Long id; /** For optimistic locking */ @Version protected Long version; @CreationTimestamp @Column(nullable = false, updatable = false, columnDefinition = "TIMESTAMP WITH TIME ZONE") protected OffsetDateTime dateCreated = OffsetDateTime.now(); @UpdateTimestamp @Column(nullable = false, columnDefinition = "TIMESTAMP WITH TIME ZONE") protected OffsetDateTime lastUpdated = OffsetDateTime.now(); } @Entity public class Country extends BaseEntity { private String name; @OneToMany(mappedBy = "country", cascade = CascadeType.ALL, orphanRemoval = true) @Builder.Default @ToString.Exclude private Set<State> states = new LinkedHashSet<>(); } @Entity public class State extends BaseEntity { private String name; @ManyToOne(fetch = FetchType.LAZY) @JoinColumn(name = "country_id") @ToString.Exclude private Country country; // owning side of the relationship @OneToMany(mappedBy = "state", cascade = CascadeType.ALL, orphanRemoval = true) @Builder.Default @ToString.Exclude private Set<City> cities = new LinkedHashSet<>(); } @Entity public class City extends BaseEntity { private String name; @ManyToOne @JoinColumn(name = "state_id") @ToString.Exclude private State state; }

And a repository interface like:
@Repository public interface CountryRepository extends JpaRepository<Country, Long> { Optional<State> findByName(String name); }

One way to fetch all related data in a single query is by writing JPQL query with JOIN FETCH. This involves making sure to use all @OneToMany annotated properties to use Set and not List, and not using FetchType.EAGER and FetchMode.JOIN., and by writing a JPQL query with @Query annotation as shown below. Make a note of both DISTINCT and JOIN FETCH. This will result into one query which fetches for a Country all States, for each State all Cities data. If it is huge set of records, your best bet is to use @EntityGraph recommended. Lets say that our data is not huge and we want to use JPQL. In this case, the repository method annotatted with JPQL Query looks like:

@Repository public interface CountryRepository extends JpaRepository { @Query(""" SELECT DISTINCT country from Country country JOIN FETCH country.states state JOIN FETCH state.cities city WHERE country.name = :name """) Optional<Country> findByName(String name); }

Gotcha

  • If you use List instead of Set, you might bump into infamous HIBERNATE concept called bag and an exception like - MultipleBagFetchException: cannot simultaneously fetch multiple bags which will force you to read a whole lot of text in finding the information you need, digging through googled links and StackOverflow without much luck, and eventually breaking your head ;)
  • There are also other ways to tackle this problem. Writing native query in the @Query annotation is another way. I wouldn't go that route as I don't want to get sucked by databases. I am sure if you take that route, you will have someone around you ready to argue in favor of Sub Selects, Stored Procedures etc. ;). My personal preference is to stay away from diving deep into database,  avoid abyss ;)

Sample Spring Boot Application - Country-State-City

Here is the link to a sample Spring Boot 3.2.3 GraphQL application which has both @Query JPQL way and @EntityGraph way of getting the single generated query that is performant in fetching all related data in one roundtrip.

References

Saturday, March 09, 2024

Spring Boot - Java GraphQL - extended scalar types . . .

This is my first te(a)ch note on GraphQL. I had hit a couple road blocks in a few of days of my hands on journey with it. Unlike good-old-days when books were the primary source of learning that had everything documented, there is no single place to find all details these days.

Environment: Java 21, Spring Boot 3.2.3, PostgreSQL 15.3, maven 3.9.6 on macOS Catalina 10.15.7

Extended or Custom Scalar types

GraphQL specifies very limited set of well-defined built-in scalar data types (primitive data types): Int, Float, String, Boolean and ID. GraphQL systems must support these as described in the specification. Everything else is an extended or custom scalar data type.

That, obviously is a very limited set supported. All other data types need custom scalar implementations which basically require coercing values at run-time and converting those to Java run-time representation. Luckily, the Java ecosystem is so huge that you almost don't need to break the ground in doing so. You will always find few open-source libraries that have tackled it already for you. GraphQL Java Scalars is one such in this Java GraphQL for extended scalar data types.

The primitive data type set supported is just not enough. You at least need support for few other data types used in any Java application like: Long, UUIDDateTime etc. They all need special considerations in your application's GraphQL schema. The DateTime takes a very special seat. In fact, anything around Dates in Java always scares me. To humans, Date and Time are the most obvious types in day-to-day life, but non in Software systems. Date is the most abused type than any other data type. Just recollect how many billions of dollars of money was wasted on this one data type in 1998 and 1999 around the globe. After 23 years of learning the mistake, the Date is still not dealt easily; it is still a complex data type to deal with ;).

To use custom scalar types other than that limited primitive set, you have to look for writing code that handles serialization, parsing and literal parsing for each additional data type. The graphql-java-extended-scalars library provides implementation for many other data types.

With a maven dependency added for this library, all you need to do is to register a scalar data type  with RuntimeWiringConfigurer as described in the README. If you need to register multiple types, it's a builder, so you can just chain those like:

@Configuration @Slf4j public class GraphQLConfiguration { /** * Custom scalar support for UUID, Long, and DateTime. Registers extended scalar types used GraphQL query schema. */ @Bean public RuntimeWiringConfigurer runtimeWiringConfigurer() { log.info("Registering extended GraphQL scalar types for UUID, Long, DateTime..."); return wiringBuilder -> wiringBuilder.scalar(ExtendedScalars.UUID) .scalar(ExtendedScalars.GraphQLLong) .scalar(ExtendedScalars.DateTime); } }

In addition to this, specify these scalar types in your application's schema.graphqls schema specification like:
"Extended scalar types" scalar UUID @specifiedBy(url: "https://tools.ietf.org/html/rfc4122") scalar Long @specifiedBy(url: "https://ibm.github.io/graphql-specs/custom-scalars/long.html") scalar DateTime @specifiedBy(url: "https://scalars.graphql.org/andimarek/date-time.html") ...

You are good to go.

Note that the extended scalar type for Long is named as GraphQLLong by this library. But, you should use Long in your your schema when you specify it as shown above. The directive @sprifiedBy is recommended to be used by GraphQL specification and is also a good practice to follow. Never ignore good practices ;)

Gotcha

Java JPA -  Instant vs. OffsetDateTime

If you are dealing with DateTime, make sure that whatever the Java type used in your code, it complies with GraphQL specification that requires date time offset.

I initially used Instant type in my JPA BaseEntity class for two properties: createdOn and updatedOn that are mapped by Hibernate provided @CreationTimestamp and @UpdateTimestamp mapped to PostgreSQL column type TIMESTAMP WITH TIME ZONE.  I switched to OffsetDateTime type because Instant is not supported and will never be by this library due to it not complying with the specification for DateTime. Java's Instant, Date and LocalDateTime do not include offset.

OffsetDateTime is an immutable representation of a date-time with an offset. This class stores all date and time fields, to a precision of nanoseconds, as well as the offset from UTC/Greenwich.

TIP

PostgreSQL offers two date time types: timestamp, timestamptz (is abbreviation of timestamp with time zone).

The following query results tell the date time story on this day light savings day of this year (Sun Mar 10, 2024). I ran it on my local PostgreSQL 15.3 running in Docker container.

-- Ran the query on Mar 10, 2024 day light savings day at EST 5:13:13 PM, EDT: 17:13:13 select version(); -- PostgreSQL 15.3 (Debian 15.3-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit show time zone; -- UTC SELECT now(), -- 2024-03-10 21:13:13.956877 +00:00 (timestamp with time zone UTC) now() AT TIME ZONE 'EST' AS est, -- 2024-03-10 16:13:13.956877 (??) now() AT TIME ZONE 'EDT' AS edt, -- 2024-03-10 17:13:13.956877 (right) now() AT TIME ZONE 'CST' AS cst, -- 2024-03-10 15:13:13.956877 (??) now() AT TIME ZONE 'CDT' AS cdt, -- 2024-03-10 16:13:13.956877 (right) now()::timestamp AT TIME ZONE 'EDT' AS timestamp_without_tz, -- 2024-03-11 01:13:13.956877 +00:00 (wrong) now()::timestamptz AT TIME ZONE 'EDT' AS timestamptz; -- 2024-03-10 17:13:13.956877 (right)

Here is the DbFiddle playground to play with the above query.

That's it in this te(a)ch note, more might come in as I walk forward along this GraphQL path.

Sunday, March 03, 2024

Enums - all the way to persistence (revisited and revised for today's tech stack) . . .

About two years ago I blogged on this combination: Enums - all the way to persistence. Technology is moving at faster pace than ever before. Java's release cadence is moving at rapid 6 months cycle, every March and September. Spring boot catches Java and other technologies and moves along at the same pace as Java, every 6 months in May and November. Of course, the PostgreSQL database, Hibernate and even Maven build system keep moving as well, at their own pace.

The challenge for Java developer is to keep up with all the moving technologies. As software development requires talented developers with years of experience and knowledge, debates go on Artificial Intelligence (AI). Some who have moved away from coding strongly feel that the generative AI which currently is capable of generating even code will replace software developers. I don't believe in that, at least at this time. The add-on 'at least at this time' is only a cautious extension to that non-generative human statement). I have tried CodeGPT lately at work, a couple of times when I was stuck with things not working together as described in documentations and blog posts, asking it's generative trained intelligence to see if it would be able to be my copilot in those development situations. It couldn't really stand up the hype in anyway, and I had to go and figure out myself all those situations.

Enums persistence - is one such problem again I did hit roadblocks lately after two years. The only change is newer versions of all of these technologies. It required additional exploration of few things before arriving at a solution that worked eventually.

Environment: Java 21, Spring Boot 3.2.3, PostgreSQL 15.3, maven 3.9.6 on macOS Catalina 10.15.7

Spring boot 3.2.3 data jpa brings in Hibernate 6.4.4 dependency.

The same persistent model described in my earlier blog post: Enums - all the way to persistence would need the following changes for enums to work.

The DDL script requires an extra PostgreSQL casting as shown below for enums:

-- create enum type genders CREATE TYPE genders AS ENUM( 'MALE', 'FEMALE' ); CREATE CAST (varchar AS genders) WITH INOUT AS IMPLICIT;

In maven pom.xml, the spring boot version is 3.2.3 and the hibernate-types-55 dependency is not needed.

Changes to domain object Person.java are shown below (the @TypeDef annotation is not required):

... import jakarta.persistence.EnumType; import jakarta.persistence.Enumerated; import org.hibernate.annotations.JdbcTypeCode; import org.hibernate.type.SqlTypes; ... @NotNull @Enumerated(EnumType.STRING) @JdbcTypeCode(SqlTypes.NAMED_ENUM) Gender gender; ... }

Changes look simple after figuring out and making things to work, but finding things that work required a bit of exploration ;)

References