Nov 12, 2013

Building Hadoop 2.2.0

I am learning the new YARN and MapReduce that came with the stable Hadoop 2.2.0 release, and I thought the best way to find out how they work is to look at the sources.


Prerequisites (copied from hadoop-common repository)


* Unix System
* JDK 1.6+
* Maven 3.0 or later
* Findbugs 1.3.9 (if running findbugs)
* ProtocolBuffer 2.5.0
* CMake 2.6 or newer (if compiling native code)
* Internet connection for first build (to fetch all Maven and Hadoop dependencies)


Environment


Linux: I am using a rather old 32bit Debian 6.0.6. 
debian@debian:~$ uname -a
Linux debian 2.6.32-5-686 #1 SMP Sun Sep 23 09:49:36 UTC 2012 i686 GNU/Linux

Java: I have the newest Java 1.7 (at the time of writing) installed
debian@debian:~$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) Server VM (build 24.45-b08, mixed mode)


Build and install the protocolbuffer-compiler 2.5.0


The newest version, and the one required by Hadoop, is 2.5.0. At the moment it is only available in the Debian experimental repository, and I could not get it installed via apt-get. If your Linux distribution provides 2.5.0 in its software repository, use that one.

First you need g++ installed. My virtual machine had very little software installed, so I had to install g++ first:

  $ aptitude install g++


  $ tar -xvzf protobuf-2.5.0.tar.gz
  $ cd protobuf-2.5.0
  $ ./configure --disable-shared #[1]
  $ make install

The above commands compiled, built and hopefully installed protoc into /usr/local/bin/protoc .


Install Maven 3.0+


Choose a 3.0+ version from the link below. I used 3.1.1, the newest one available at the time of writing. http://maven.apache.org/download.cgi

Download the binary tar.gz and put Maven in its place:
  $ tar -xvzf apache-maven-3.1.1-bin.tar.gz
  $ mkdir -p /usr/local/maven/
  $ mv apache-maven-3.1.1 /usr/local/maven
  $ ln -s /usr/local/maven/apache-maven-3.1.1 /usr/local/maven/current

Put a symlink into /usr/sbin
  $ ln -s /usr/local/maven/current/bin/mvn /usr/sbin/mvn

This is in fact the same way you install the Oracle JDK/JRE. The other way is to add the application's .../bin folder to the $PATH variable at the end of /etc/bash.bashrc.


Install Git


This is available from repository:
  $ aptitude install git


Clone hadoop-common


Go to your Eclipse workspace, or create one if you don't have any. I put it into my home:
  $ mkdir -p ~/Development/workspace_eclipse_java
  $ cd ~/Development/workspace_eclipse_java

Clone the git repository:
  $ git clone https://github.com/apache/hadoop-common.git hadoop-common


Install hadoop Maven plugin


Hadoop has its own Maven plugin, which has to be installed first:
  $ cd hadoop-maven-plugins
  $ mvn install


First build everything


I found the project setup and build well documented. Everything is written down in the BUILDING.txt [2] 

First you need to build the whole hadoop-common so that Maven caches the dependency jars in your local repository. That way, Eclipse will be able to resolve all the inter-project dependencies.

  $ cd hadoop-common
  $ mvn install -DskipTests -nsu # -nsu (--no-snapshot-updates) suppresses SNAPSHOT updates


Generate Eclipse projects


I am only interested in YARN and MapReduce components, so I will:
  $ cd hadoop-yarn-project
  $ mvn eclipse:eclipse -DskipTests


Set M2_REPO variable in Eclipse


If it is not set yet, you have to create a variable in Eclipse pointing to your local Maven repository, because every dependency in the generated .classpath file starts with M2_REPO/..

  [Window] => [Preferences]
  Java -> Build Path -> Classpath Variables

Add a new one named M2_REPO pointing to your local Maven repository, which by default is at /home/username/.m2/repository
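If you are not sure where that default lives for your user, a tiny Java program can print it. This is only a sketch: the class name M2RepoLocation is made up for illustration, and it assumes you have not overridden the location with <localRepository> in your Maven settings.xml.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Maven or Eclipse.
class M2RepoLocation {
    /** Maven's default local repository: <user home>/.m2/repository. */
    static Path defaultLocalRepository() {
        return Paths.get(System.getProperty("user.home"), ".m2", "repository");
    }

    public static void main(String[] args) {
        // Paste the printed path into the M2_REPO classpath variable.
        System.out.println("M2_REPO = " + defaultLocalRepository());
    }
}
```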


Import projects into Eclipse


  [File] => [Import]
  General -> Existing Projects into workspace
Set your root directory to the hadoop component you want to import. In my case it's 
  hadoop-common/hadoop-yarn-project/hadoop-yarn

I highly recommend creating a working set for every Hadoop component, since each of them consists of several Eclipse projects.

Enjoy!



[1] http://www.coderanch.com/how-to/java/ProtocIndependentBinary

[2] https://github.com/apache/hadoop-common/blob/trunk/BUILDING.txt

Aug 16, 2013

Demystifying JSF lifecycle

(Mojarra 2.1.25)

17-08-2013: I made the source available, see below

There are two sections in this post. The first part is about how the JSF lifecycle works, the second part shows events and action invocations on diagrams.

What is a view state?

JSF devs are more or less familiar with the JSF lifecycle. I will just quickly show you how it looks, to refresh your memory.

JSF lifecycle

It took me quite long to clearly understand what is going on under the hood. Namely, what restoring the view means and why JSF has to set values in the application twice: in the Apply request values phase and then in the Update model values phase.

When JSF handles a GET request, it first looks at the xhtml the request points to. It parses that facelet, traverses every template definition, ui:include and component that can be reached, and builds up an in-memory representation of the components called the view tree. That is right, it happens on every request, which is why a huge view tree makes the application slower. One solution would be to take the view tree from a pool for every request, as Rudi Simic suggests in this article: Stateless JSF – high performance, zero per request memory overhead - http://www.industrieit.com/blog/2011/11/stateless-jsf-high-performance-zero-per-request-memory-overhead/ . It is a great article that teaches a lot about how JSF works.

Saving the view state - When you access a page that contains at least one <h:form>, the view state is saved in the session before the request ends. This is in fact some representation of the current view tree. Using an older Mojarra, 2.1.2, reveals this behavior: the JSESSIONID cookie is only created when you encounter a page that either contains a form or uses a @SessionScoped backing bean. In Mojarra 2.1.25, the JSESSIONID cookie is created regardless. You can see the id of the view state at the end of each form in a hidden input field.

Restoring the view state - On postback from a JSF form, the view state is restored. All data of the view tree that has been persisted for this view or facelet, and attached to this view state id, is read from the session and used to restore the view tree.

<input type="hidden" name="javax.faces.ViewState" id="javax.faces.ViewState" value="-4005688715831258364:-4133587133831516823" autocomplete="off">

The reason for restoring the previous view tree for POST requests is to make sure every component is in place when applying the request values, updating the model values and calling any component event listeners. At first it's not obvious why this is necessary, but the view tree can change dynamically from request to request, and JSF has to make sure that the postback of a form arrives at the same view tree as the one that left the server.
Suppose that you dynamically add some input fields to a form in the invoke application phase or the render response phase. After a postback you could add them again, but only after the apply request values and update model values phases. JSF still has to invoke the setters behind the value attribute of those input fields and call any component event listeners, therefore it restores those dynamically added components as early as possible. The only thing you have to know for now is that the values of input components (such as <h:inputText>) are stored in the view state in the session on the server. So form data is in fact stored in two places: in your backing beans, and in the UIComponent representing the input field.

Applying request values - Once the view state is restored into the components, JSF sets the values of the components to the ones that came with the request as POST params. This is the responsibility of the decode() method of the UIComponent's Renderer. For example, if you create your own paginator component that goes to the next page when the request value pagerClientId:page arrives, this is the place where you inform your component about the new value.


Updating model values - This late phase is where the site developer first (most of the time) interacts with JSF. This is where the setters of the backing beans are called.

Invoking application - You should be familiar with this phase. As seen in the diagram above, this is where actions and actionListeners are invoked, regardless of whether it is an AJAX request or a full page request.

Rendering the response - JSF calls the encode methods (encodeBegin(), encodeChildren(), encodeEnd()) on the ViewRoot, which turn the in-memory view tree into an HTML representation that is sent back to the client. Every component is responsible for encoding its children, so in practice the tree is traversed depth-first. Additionally, the view state is saved in the session if there is a form element in the tree.
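To make the order of the phases easier to picture, here is a toy model in plain Java. It is only a sketch under loud assumptions: ToyLifecycle, ToyInput and Bean are invented names and none of this is the JSF API. It also includes the Process validations phase, which sits between applying the request values and updating the model. The point it tries to show is that the component holds the request value on its own, and the bean setter is reached only later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical stand-ins, not the JSF API.
class ToyLifecycle {
    /** Backing bean: its setter is reached only in the update-model phase. */
    static class Bean {
        Integer number;
        final List<String> log;
        Bean(List<String> log) { this.log = log; }
        void setNumber(Integer n) { log.add("setter(" + n + ")"); number = n; }
        void submit() { log.add("action(number=" + number + ")"); }
    }

    /** Input component: keeps the raw request value separately from the bean. */
    static class ToyInput {
        final String clientId;
        String submittedValue;   // raw string taken from the POST parameters
        Integer localValue;      // converted and validated value
        ToyInput(String clientId) { this.clientId = clientId; }
        void decode(Map<String, String> params) { submittedValue = params.get(clientId); }
        boolean validate() {
            try {
                localValue = Integer.valueOf(submittedValue);
                return true;
            } catch (NumberFormatException e) {
                return false;
            }
        }
    }

    /** Runs one simulated postback and returns the order in which things happened. */
    static List<String> postback(Map<String, String> params) {
        List<String> log = new ArrayList<>();
        Bean bean = new Bean(log);
        ToyInput input = new ToyInput("form:number");

        log.add("RESTORE_VIEW");
        log.add("APPLY_REQUEST_VALUES");
        input.decode(params);                  // the component receives the request value
        log.add("PROCESS_VALIDATIONS");
        if (input.validate()) {                // conversion and validation
            log.add("UPDATE_MODEL_VALUES");
            bean.setNumber(input.localValue);  // the bean setter is called only now
            log.add("INVOKE_APPLICATION");
            bean.submit();
        }
        log.add("RENDER_RESPONSE");            // rendered even when validation failed
        return log;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("form:number", "42");
        postback(params).forEach(System.out::println);
    }
}
```

With a value that cannot be converted (for example "abc"), the toy skips straight from validation to rendering, which mirrors why a failed validation never reaches your setters or actions.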

What happens when?

The next thing that takes much practice is knowing when actions, listeners, actionListeners, lifecycle phase listeners, component phase listeners, setters, getters and valueChangeListeners are invoked. I created a very simple ajax form with various bindings to my application: a backing bean, a converter, a validator, component phase listeners and a lifecycle phase listener. The goal is to create a clear view of what gets called and when.
  • red events are from the lifecycle phase listener and show JSF lifecycle before and after events
  • blue events come from the declared component lifecycle phase listeners
  • green events are the ones that are calls to backing beans or custom converters, validators
Of course this is a very simple form and there are several things I haven't tested. For example, when component phase listeners are invoked on command buttons, or whether the PreRenderView listener is invoked when the component is not rendered (probably not).

I put arrows going from the xhtml to the events to give you a hint about which event comes from which declaration.

Sorry about the rather poor image quality. Follow the links below the images to see them at full size.

Sources on github: https://github.com/pkonyves/jsf-lifecycle-explained
Download as ZIP link: https://github.com/pkonyves/jsf-lifecycle-explained/archive/master.zip
Building instructions are in README.md

First page load



When you use <f:viewParam> parameters, the full lifecycle is triggered, including the APPLY_REQUEST_VALUES through INVOKE_APPLICATION phases. That is because view parameters are implemented as UIInput components (UIViewParameter extends UIInput), so they need these phases.

Partial ajax request executing only the 'number' input field


render: @all, execute: number - http://i.imgur.com/YurYAYj.png?1

Some remarks:
Value change listeners are invoked before setters, and action listeners are called before actions. This means that when the value change listener gets called, the new value is not yet set in the backing bean. Of course, the javax.faces.event.ValueChangeEvent makes it easy to retrieve the old and new values. And not surprisingly, value change listeners are only invoked when the value changed, whereas setters are invoked every time.

All of the pictures were taken in one session, and by the object IDs you can see that the components were recreated on every request. I haven't tested with MyFaces, maybe that is smarter.

Ajax request executing the whole form


render: @all, execute: @form - http://i.imgur.com/7TfTDBn.png?1


Apr 18, 2013

Exceptions and Transactions in EJB, dealing with EJBException and EJBTransactionRolledbackException

When an exception is thrown from an EJB, several things work very differently from what you are used to in a Java SE environment, or generally outside of the EJB scope.

Things you have to consider:
  • Some exceptions thrown from the EJB scope trigger transaction rollback, other ones don't
  • Some exceptions coming from EJB get wrapped in an EJBTransactionRolledbackException, others don't
  • Whether or not the exception came from "nested" EJB method calls

Basic setup

You call an EJB method that has its transaction attribute set to Required or RequiresNew. Two things distinguish this scenario from others: the callee's transactional state may differ from the caller's, and there is always a transaction on the callee side.

Unchecked (java.lang.RuntimeException) exceptions

When a java.lang.RuntimeException is thrown from the myClientEjbMethod() method, the container will 1) roll back the transaction, 2) discard the EJB bean instance, 3) throw a javax.ejb.EJBException to the client. Every database change made in your EJB call will be rolled back.

When you know what kind of exceptions to expect in your web layer, you have to unroll the EJBException to dig out the real reason, and show it to the user if it's meaningful to her. These kinds of exceptions would be specialized runtime exceptions you defined (let's say BusinessValidationException extends RuntimeException).

Digging into the real reason is a simple method:

Throwable unrollException(Throwable exception,
  Class<? extends Throwable> expected){

  while(exception != null && exception != exception.getCause()){
    if(expected.isInstance(exception)){
      return exception;
    }
    exception = exception.getCause();
  }
  return null;
}
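A short usage sketch of the method above, in plain Java so that it runs without an application server. The assumptions are marked in the comments: a plain RuntimeException stands in for javax.ejb.EJBException, and UnrollDemo is just an illustrative holder class.

```java
// Illustrative holder class; the names are hypothetical.
class UnrollDemo {
    /** The specialized runtime exception suggested in the text. */
    static class BusinessValidationException extends RuntimeException {
        BusinessValidationException(String message) { super(message); }
    }

    /** Walks the cause chain and returns the first exception of the expected type. */
    static Throwable unrollException(Throwable exception, Class<? extends Throwable> expected) {
        while (exception != null && exception != exception.getCause()) {
            if (expected.isInstance(exception)) {
                return exception;
            }
            exception = exception.getCause();
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumption: RuntimeException plays the role of the EJBException wrapper.
        Throwable wrapped = new RuntimeException("EJB wrapper",
                new BusinessValidationException("Team is already full"));

        Throwable cause = unrollException(wrapped, BusinessValidationException.class);
        if (cause != null) {
            System.out.println(cause.getMessage());   // message that is safe to show to the user
        }
    }
}
```

The null check matters in the web layer: when the chain holds no exception of the expected type, you probably want to fall back to a generic error page.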

Checked (java.lang.Exception) exceptions

When a java.lang.Exception is thrown from the EJB method, you obviously have to declare it in the method's signature with the throws keyword. What happens in this case is: 1) the transaction gets committed, 2) the exception is rethrown to the client. ...Yes, that's it. No rollback occurs, and the client sees the exact same exception that was thrown. In EJB terms a checked exception is an ApplicationException. The name indicates that this exception signals a problem the application developer is aware of. I can think of two types I would define for my application:
  • BusinessValidationException extends java.lang.Exception
  • BusinessLogicException extends java.lang.Exception
Because you, as the application developer, throw the ApplicationException on purpose, you have to decide whether the transaction has to be rolled back or the flow can continue. In the latter case you might throw the exception at the end of the method, to indicate that some problem happened during processing although the method actually executed successfully.

To force the rollback manually for an ApplicationException, you have to get a reference to the EJBContext and call EJBContext.setRollbackOnly() before throwing it.

You also have to be conscious about whether you want to simply rethrow checked exceptions such as FileNotFoundException or ParseException from various utilities. They don't cause a rollback when simply rethrown from the EJB method, because they have to be declared in the method signature and are therefore ApplicationExceptions. The EJB specification recommends that in this case you wrap these exceptions in a javax.ejb.EJBException. These are likely to be errors outside the developer's control, and you cannot continue the transaction, so you rethrow them packaged in an EJBException, which is a runtime exception, because runtime exceptions do cause a rollback.

For example:


Date getDateFromStringEjbMethod(String yearString){
  // "yyyy" is the calendar year; "YYYY" would mean the week year
  SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy");
  try{
    return dateFormat.parse(yearString);
  } catch(ParseException e){
    throw new EJBException(e);
  }
}

Design your business logic exceptions

I find it a burden to call EJBContext.setRollbackOnly() every time I throw my little BusinessValidationException. I might also forget to call it, resulting in a commit and an inconsistent DB state.
You have a few sensible options to define exception types that mean an error in the business flow:
  1. Inherit your BusinessValidationException from java.lang.RuntimeException:
    Transaction is rolled back automatically, but you have to unroll the exception on client side, because it will be wrapped into an EJBException
  2. Inherit your BusinessValidationException from java.lang.RuntimeException (same as previous), and annotate this class with @ApplicationException(rollback=true)
    What you gained is, you will get the same exception on client side, not a wrapper, and the transaction is still rolled back.
  3. Inherit from java.lang.Exception and annotate it with @ApplicationException(rollback=true)
    This is the same as 2) except, you can warn the client to be aware of a business kind of exception to handle in a user-friendly way.
  4. 2) or 3) with rollback=false attribute for problems where the transaction can be continued/committed, but an indication should be sent to the user that there was a problem
When I throw business workflow errors or validation exceptions from my EJB methods, I always try to define a meaningful, not too technical message and show it to the user, instead of letting the exception bubble up and having the servlet container show the standard error page.


Exceptions from nested EJB calls



When you call myOtherEjbMethod() from myEjbMethod(), things change a little. The container's behavior for myOtherEjbMethod() will be the same for ApplicationException and RuntimeException as described previously, i.e.: if an ApplicationException is thrown, you have to decide whether the transaction of myOtherEjbMethod() will be rolled back or not. If a runtime exception occurs, the transaction gets rolled back regardless (unless that RuntimeException is annotated with @ApplicationException).

But what will you see in the caller, myEjbMethod()? In nested calls, myEjbMethod() is the client, and the container may wrap the actual exception in a javax.ejb.EJBException or a javax.ejb.EJBTransactionRolledbackException. The rules are (I think) intuitive once you have learned the behavior of the simple case in the previous section.

When myOtherEjbMethod() uses the client's ( myEjbMethod() ) transaction, the container will wrap the RuntimeException thrown from the nested method in a javax.ejb.EJBTransactionRolledbackException. Why? Because this way you will know that continuing the transaction in myEjbMethod() is "fruitless" (as the EJB 3.1 specification puts it).

When myOtherEjbMethod() uses a new transaction (with the RequiresNew transaction attribute), in case of a RuntimeException the container will wrap it in an EJBException. You will know that it caused the new transaction to roll back, but you can continue the outer transaction. This is actually what RequiresNew is good for.

Sum it up

EJB distinguishes between application exceptions and system exceptions. An application exception is something that you define, you throw, and you are aware of. By default an application exception does not cause a rollback unless you define it that way (and I think that is recommended). Every checked exception that is declared in the method signature, and also any checked or unchecked exception that is annotated with @ApplicationException, is an application exception.

System exceptions happen in cases you don't control, and they are unchecked exceptions. They always cause a rollback. It is good practice to wrap unavoidable checked exceptions in your method (e.g. ParseException) in an EJBException.
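The rules summed up here can be written down as a small decision function. The following is a toy model in plain Java, not the container's implementation: AppException is a hypothetical stand-in for javax.ejb.ApplicationException, and the model is simplified (it ignores the annotation's inheritance setting and any special cases the specification makes).

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Toy model of the rules described above; nothing here is the real EJB API.
class RollbackRules {
    /** Hypothetical stand-in for javax.ejb.ApplicationException. */
    @Retention(RetentionPolicy.RUNTIME)
    @interface AppException {
        boolean rollback() default false;
    }

    /** Checked exception: an application exception without rollback. */
    static class ValidationFailed extends Exception {}

    /** Unchecked, but annotated: an application exception with rollback. */
    @AppException(rollback = true)
    static class BusinessRuleBroken extends RuntimeException {}

    /** Checked or annotated exceptions count as application exceptions. */
    static boolean isApplicationException(Throwable t) {
        boolean checked = !(t instanceof RuntimeException) && !(t instanceof Error);
        return checked || t.getClass().isAnnotationPresent(AppException.class);
    }

    /** True when the simulated container would roll the transaction back. */
    static boolean rollsBack(Throwable t) {
        AppException marker = t.getClass().getAnnotation(AppException.class);
        if (marker != null) {
            return marker.rollback();          // the explicit choice wins
        }
        return !isApplicationException(t);     // system exceptions always roll back
    }

    /** True when the client would see a wrapper instead of the original exception. */
    static boolean isWrappedForClient(Throwable t) {
        return !isApplicationException(t);
    }

    public static void main(String[] args) {
        Throwable[] samples = {
            new ValidationFailed(), new BusinessRuleBroken(), new IllegalStateException()
        };
        for (Throwable t : samples) {
            System.out.println(t.getClass().getSimpleName()
                    + " rollback=" + rollsBack(t)
                    + " wrapped=" + isWrappedForClient(t));
        }
    }
}
```

Reading the three sample lines side by side shows the trade-off from the options list: only the annotated runtime exception both rolls back and reaches the client unwrapped.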

Mar 22, 2013

How to design clean business tier with EJB and JPA

Prologue


In this post, I will walk you through the thinking and ideas behind designing good EJB APIs. It is kind of long, but I want you to understand the ideas behind every choice.

Java EE was designed based on the observation that most business applications are built on a 3-tier architecture:
  • Data Access Layer (JPA, JDO, JDBC, proprietary NoSQL APIs)
  • Business Logic Layer (EJB, CDI)
  • Presentation Layer (Servlet, JSF, JSP, JAX-RS, JAX-WS)
Although the Java EE stack APIs are fairly well designed, they are also very complex and have steep learning curves. Not only is each technology hard to learn, one also has to understand how they cooperate.

I like the phrase separation of concerns. Good software architecture is born only when these three words are constantly on the minds of the architects and developers. The 3 tiers are based exactly on separating the concerns of the program components. A good EJB design depends on whether or not you understand the concerns your application has to address.


Design your software architecture through use-cases


The advantage of using JPA, EJB and JSF only becomes clear (and does not become overwhelming) when you first sit down at your desk and start to think about the use-cases of your software. These are basically the user's interactions with your software. You must have a very clear definition of at least a subset of related use-cases for your software.

Example:
You have to create software for a tennis club. The users are all going to be players, and twice every year there is a championship. The championship is played among teams. A player can be a member of several teams. The teams are only permanent for one championship; they are formed again for every new championship. A player assigns herself to a team (or more). Of course there are many matches in a championship. It's the role of the administrator to create a championship, declare teams, create matches, and record match results once they are played. The role of the users is to see the matches and championship results, and to assign themselves to a team.

This is only a very simplified specification, but turning it into software takes a lot of consideration. You have to identify the exact use-cases such as:
  • administrator creates championship
  • administrator modifies championship attributes
  • administrator creates a match (when? who will be the opponents? what are the minimum attributes for a match?)
  • administrator modifies match attributes
  • administrator assigns team players to matches as opponents
  • replace players across teams
  • ...
Why do you need to define them so explicitly? Because you have to understand and partition the problem domain. If you don't know the problem well enough, you will not be able to separate what belongs to the business logic layer from what belongs to the presentation layer. And this is key to creating a good EJB API.
Know your problem domain and use-cases well!

Atomic operations and consistency (EJB)


The brain of your software sits in your EJBs. One of the most important features of EJBs for us is:
  They are transactional by default (regarding RDBMS or messaging services). When you reach your database from an EJB, most probably via JPA, you already have a transaction for the database session. As a consequence, every EJB operation is atomic toward the underlying database. Either every DB change in the EJB takes place, or none.
You must map each use-case you found to exactly one EJB method call from the web gui! This is crucial to keep your data consistent.
Trivial Example:
The last use-case was to take a player out of team A and put her into team B instead. Broken pseudo-code from the web layer may look like this.

@EJB
ChampionshipBean championshipBean;

public void transferPlayer(Player player, Team toTeam) {
  Team currentTeam = championshipBean.getCurrentTeam(player); // 1. 
  championshipBean.deletePlayerFromTeam(player, currentTeam); // 2. 
  championshipBean.assignPlayerToTeam(player, toTeam);        // 3.
  refreshView();
}

One way (among many) for this code to go wrong is when, after you deleted the player from her team (2.), toTeam is deleted by another administrator who is working with the software concurrently, so (3.) throws an exception saying "no such team as toTeam". The result is that the player gets deleted from her currentTeam, but is not assigned to another. When the user sees the error "no such team as toTeam", he expects the player to be in the same team as before the change, but she is gone. This is erroneous behavior.

If you put the method transferPlayer(Player player, Team toTeam) into an EJB method, after the same scenario, the DB will be in a consistent state, because either every change takes place or none, in an atomic operation.
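The difference can be sketched without a database. In the toy below, in-memory collections stand in for the tables, and a snapshot-and-restore mimics what the container-managed transaction gives you for free. All names (TransferDemo and its methods) are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// In-memory toy: the collections stand in for database tables.
class TransferDemo {
    final Set<String> teams = new HashSet<>();
    final Map<String, String> teamOfPlayer = new HashMap<>();

    void deletePlayerFromTeam(String player) {
        teamOfPlayer.remove(player);
    }

    void assignPlayerToTeam(String player, String team) {
        if (!teams.contains(team)) {
            throw new IllegalStateException("no such team as " + team);
        }
        teamOfPlayer.put(player, team);
    }

    /** Broken: two independent steps; a failure in the second leaves the player teamless. */
    void transferInTwoCalls(String player, String toTeam) {
        deletePlayerFromTeam(player);
        assignPlayerToTeam(player, toTeam);
    }

    /** All-or-nothing: on failure the previous state is restored before rethrowing. */
    void transferAtomically(String player, String toTeam) {
        Map<String, String> snapshot = new HashMap<>(teamOfPlayer);
        try {
            deletePlayerFromTeam(player);
            assignPlayerToTeam(player, toTeam);
        } catch (RuntimeException e) {
            teamOfPlayer.clear();
            teamOfPlayer.putAll(snapshot);   // the "rollback"
            throw e;
        }
    }

    public static void main(String[] args) {
        TransferDemo db = new TransferDemo();
        db.teams.add("A");                   // team "B" was deleted by someone else
        db.teamOfPlayer.put("Anna", "A");
        try {
            db.transferAtomically("Anna", "B");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + ", Anna is still in " + db.teamOfPlayer.get("Anna"));
        }
    }
}
```

In a real EJB you would not write the snapshot code at all; the sketch only makes visible what "either every change takes place or none" means for the caller.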


Different call scenarios for the same use-case


Sticking with the previous example, suppose you want to expose a REST interface to your application, or just create a different view for the player transfer option on your web gui. You need the same behavior initiated from a totally different context. This is a convenient situation to see whether you created a good EJB design or a crappy one. In the latter case you will see that you exposed too much business logic in your presentation layer: connected JPA entities together via setters; made changes to detached entities that are a couple of seconds old and come from SessionScoped JSF ManagedBeans; made several EJB calls that change your database for the same use-case.

When you encounter any of the above patterns, remember separation of concerns from the first section. The use-case should be implemented once in the EJB, then it can be called from as many places as you wish.

A clean EJB API


JPA Entity parameters?


I struggled a lot with whether to pass entities as EJB method parameters, or pass only entity attributes.

@Stateless
public class ChampionshipBean {
  public void assignPlayerToTeam(Player player, Team team) { .. }
  // vs.
  public void assignPlayerToTeam(Integer playerId, Integer teamId) { .. }

  public void changeMatch(Match match) { .. }
  // vs.
  public Match setMatchPoints(Integer matchId, List<Integer> points) { .. }
}

Then I recalled that what we had done in PHP was passing only database record primary keys to the forms in hidden input fields, to identify those records after a form POST. And it's the same when we use JSF. The only difference is that the JSF framework hides this behavior, and we can keep whole entities in memory in @SessionScoped ManagedBeans across requests. However, we often do not even need to keep whole entities or lists of entities in memory in the web layer, because we only want the primary key, or we transform them into new POJOs that are much easier to use in a dataTable. We still need the entity's primary key to identify the original entity for our operations though.

My conclusion was that:
In most cases it is totally enough to pass back to the EJB only the primary key of an entity and the parameters that have to be changed.
Because:
  • often we don't need to keep whole entities in web layer (presentation layer)
  • if we don't have the original entity in memory in web layer, a method call like assignPlayerToTeam(Player player, Team team) requires us to first call a method getPlayerForId(Integer playerId) then pass the returned Player to the preceding method. This is totally unnecessary and annoying
  • If we pass whole entities to the EJB methods, they are detached, so every time they have to be attached again with entityManager.merge(player). This does not seem to be a lot of trouble, but when you have complex business logic and make use of nested EJB calls, you will not want to unnecessarily merge your entities all the time just because you are not sure whether the parameter was attached or detached.
  • When you want to use your EJBs remotely because you deploy on a DAS cluster, you want to eliminate every unnecessary traffic overhead.


Unambiguous side effects of EJB methods


The previous reasoning was more of a technical one, but there is a semantic reason as well. When you pass an entity to an EJB method, you cannot know for sure what is going to be persisted into the database. Let's get back to this example:

@Stateless
public class ChampionshipBean {
  public void assignPlayerToTeam(Player player, Team team) {
    player.setTeam(team);
    List<Player> players = team.getPlayers();
    players.add(player);
    entityManager.merge(player);
    entityManager.merge(team);
  }
}

This seems reasonable, right? Well, it's totally wrong! The developer of the presentation layer will not know what side effects are going to take place besides the two entities getting a relationship. He might think he can also change the name of the player with player.setName("Bob"); before calling the method above, because the entity will be merged anyway, so the new name will surely be persisted.
  But you, who implemented assignPlayerToTeam(Player player, Team team), might have been thoughtful and made sure that these kinds of side effects cannot happen: the method does not change anything besides setting the relationship between the two entities. To clear up this ambiguous behavior, use only primary keys as parameters: assignPlayerToTeam(Integer playerId, Integer teamId). If you still want to pass whole entities as arguments, be clear in the method's javadoc about how it behaves, and be consistent with similar methods!
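A plain-Java sketch of the primary-key variant, under stated assumptions: the two maps are hypothetical stand-ins for database tables, Map.get() plays the role of entityManager.find(), and the entity classes are stripped down to the fields needed here. It shows that a name edited on a detached copy in the web layer never reaches the stored data, because only the keys cross the boundary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy sketch: maps instead of an EntityManager, minimal entity classes.
class AssignByIdDemo {
    static class Team {
        final Integer id;
        final List<Player> players = new ArrayList<>();
        Team(Integer id) { this.id = id; }
    }

    static class Player {
        final Integer id;
        String name;
        Team team;
        Player(Integer id, String name) { this.id = id; this.name = name; }
    }

    final Map<Integer, Player> playerTable = new HashMap<>();
    final Map<Integer, Team> teamTable = new HashMap<>();

    /** Changes the player-team relationship and nothing else. */
    void assignPlayerToTeam(Integer playerId, Integer teamId) {
        Player player = playerTable.get(playerId);   // stands in for a find() by key
        Team team = teamTable.get(teamId);
        if (player == null || team == null) {
            throw new IllegalArgumentException("unknown player or team");
        }
        if (player.team != null) {
            player.team.players.remove(player);      // detach from the old team
        }
        player.team = team;                          // set both sides of the relationship
        team.players.add(player);
    }

    public static void main(String[] args) {
        AssignByIdDemo db = new AssignByIdDemo();
        db.playerTable.put(1, new Player(1, "Alice"));
        db.teamTable.put(10, new Team(10));

        Player detachedCopy = new Player(1, "Bob");  // renamed in the web layer
        db.assignPlayerToTeam(detachedCopy.id, 10);  // only the key is passed

        System.out.println(db.playerTable.get(1).name + " plays in team "
                + db.playerTable.get(1).team.id);
    }
}
```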


Changing simple Entity properties as opposed to entity relationships

What does public void changeMatch(Match match) do? You can create a method like this, but a better name would be changeMatchProperties(Match match). It suggests that the method allows changing simple properties such as match#startTime or match#place, but does not allow creating or deleting relationships between entities: match.setPlayerOne(player); will simply have no effect at all.

You have to create separate EJB methods for updating an entity's properties and for updating relationships between entities, mainly because a relationship between JPA entities must be set on both sides, and then both have to be merged. A defensive solution for changeMatchProperties(Match match):

void changeMatchProperties(Match match) {
  // find() takes the entity class first, then the primary key
  Match attachedMatch = entityManager.find(Match.class, match.getId());
  attachedMatch.setStartTime(match.getStartTime());
  attachedMatch.setPlace(match.getPlace());
  // set other properties
  entityManager.merge(attachedMatch);
}

Some rules of thumb

  • the name must be clear about what exactly the method does
  • follow consistent naming conventions in your code to differentiate: CRUD operations, entity relationship handling and more complex functions.
  • if entities are passed as parameters, only allow the changes that are suggested in the method name, no unexpected side effects. Prohibit entity relationship targeted changes in changeEntityProperties named methods.
  • create methods for entity relationship handling that do not allow changing any other entity attributes, only attaching entities
  • pass only primary keys as much as possible in favor of passing entities
  • never ever try to pass information back from an EJB through references in the arguments! If you ever want to call your methods remotely, you will be surprised
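The last rule can be demonstrated without any EJB at all. Remote calls pass their arguments by value (they are serialized), so the callee only ever sees a copy. In the sketch below, a serialization round trip imitates that copy; RemoteArgumentDemo, copyLikeRemoteCall and collectWarnings are invented names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Toy sketch: a serialization round trip imitates by-value remote arguments.
class RemoteArgumentDemo {
    /** Returns a deep copy of the argument, as the callee of a remote call would receive it. */
    static <T extends Serializable> T copyLikeRemoteCall(T argument) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(argument);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("copy failed", e);
        }
    }

    /** Anti-pattern: tries to hand results back through the argument. */
    static void collectWarnings(List<String> warnings) {
        warnings.add("team is full");
    }

    /** Number of warnings the caller can see after a local or a simulated remote call. */
    static int warningsSeenByCaller(boolean remote) {
        ArrayList<String> warnings = new ArrayList<>();
        if (remote) {
            ArrayList<String> copy = copyLikeRemoteCall(warnings);
            collectWarnings(copy);       // the callee mutates its own copy
        } else {
            collectWarnings(warnings);   // the callee mutates the caller's list
        }
        return warnings.size();
    }

    public static void main(String[] args) {
        System.out.println("local call:  " + warningsSeenByCaller(false));
        System.out.println("remote call: " + warningsSeenByCaller(true));
    }
}
```

The safer design is the obvious one: return the warnings (or a small result object) from the method instead of writing into a parameter.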


Don't create entity facades for CRUD operations as EJB

See http://weblogs.java.net/blog/felipegaucho/archive/2009/04/a_generic_crud.html point 5. This is a good pattern for a data access layer to use in your EJBs! However, consider that a decent project could have many tens of entities. Do you really want to create 10-20 EJBs just for CRUD operations, which will be only a small part of your complex use-cases? When you modify match attributes or set points for matches, you must enforce constraints in your EJB as I described in Unambiguous side effects of EJB methods. Simple create, read, update, delete operations on single entities will be rare, because you usually don't just delete a team, you have to deal with the matches you already created for it. So think before you start writing these kinds of facades, and see if you really need all of them or only a subset. Don't create unnecessary methods in your EJB API. They cause confusion.

Accordingly, you don't create EJBs for each of your entities! You create EJBs for groups of your problem domain: managing matches, managing users, managing championships.

Designate one EJB, called e.g. RepositoryBean, for read-only operations. It is very unlikely that you only need to present one entity on a webpage. You usually have to request several entities to present the data on your page. Think about whether it's easier to inject only one EJB into the presentation layer or a bunch of them.

How do "grouping the problems into different EJBs" and "designating one super EJB" not contradict each other? They can... When you have to use queries that are very specific to one set of problems, e.g. findAllMatchesOfOpponentTeams(Integer teamAPk, Integer teamBPk);, you may put them into the ChampionshipBean, because they are logically cohesive. But for single entity or entity list requests, you will be glad not to inject 2-3-4 EJBs into your presentation layer just to present one page. On the other hand, when updating your database, it's easier to find the right EJB method within a single, problem-specific class.
When you have to decide which method goes into which EJB, think about the cohesion of your methods. Ask yourself: "Am I going to use these methods together, one after another? Do these methods address a single problem group, so that I would rather find them in one place?"
http://en.wikipedia.org/wiki/Cohesion_(computer_science)#Types_of_cohesion


Public and private EJB API

The preceding examples were very simple CRUD operations. But what to do when it comes to more complex use-cases? You can create reusable EJB methods that can be called from other EJB methods. It would also be good practice to put EJBs for public and for private use into different packages. (Unfortunately, it's not allowed to declare EJBs or EJB methods as package private.) That way you can make reusable EJB methods that are intended to be used from other EJB methods, but not from client code (presentation layer). For example, you are allowed to pass whole entities to nested EJB method calls instead of only the primary keys, and you don't have to worry about merging the entities into the entity manager within those called methods. But it has to be well documented and clear which EJBs should be used by the client.

Please share your thoughts! :)