Text generation using the Estimator API

I have been trying to transition to the Estimator API since the TensorFlow team recommends it. However, I wonder how some basic tasks can be done efficiently within the Estimator framework. Over the weekend I tried to build a GRU-based model for text generation, following the TensorFlow tutorial on building custom Estimators. I was able to create a model that trains fairly easily and whose results match my non-Estimator version.

However, sampling (generating text) gave me some trouble. I finally got it working, but it is very slow: every time it predicts a character, the Estimator framework loads the whole graph again, which slows everything down. Is there a way to avoid loading the graph on every prediction, or some other solution?

Second issue: I also had to use state_is_tuple=False, since I have to pass the GRU state back and forth (between the model function and the generator function) and I can't pass tuples. Does anyone know how to deal with this?

Thanks.

P.S. Here is a link to my code example: https://github.com/amirharati/sample_estimator_charlm/blob/master/RnnLm.py
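To make the bottleneck concrete, here is a TF-free sketch (a mock `Estimator`, not my actual code or the real TensorFlow API) of the two control flows: calling `predict()` once per character pays the graph-loading cost on every step, while a single `predict()` call fed by a queue-backed `input_fn` pays it only once and keeps yielding predictions.

```python
import queue

class MockEstimator:
    """Stand-in for tf.estimator.Estimator: each predict() call
    simulates rebuilding/loading the graph before yielding predictions."""
    def __init__(self):
        self.graph_loads = 0

    def predict(self, input_fn):
        self.graph_loads += 1          # expensive per-call setup
        for features in input_fn():
            yield features + 1         # dummy "model": next token = token + 1

def sample_slow(est, n):
    """One predict() call per character -> one graph load per character."""
    out, token = [], 0
    for _ in range(n):
        token = next(est.predict(lambda t=token: iter([t])))
        out.append(token)
    return out

def sample_fast(est, n):
    """Single predict() call; a queue feeds each new input as the
    previous prediction arrives, so the graph is loaded only once."""
    q = queue.Queue()
    q.put(0)                           # seed token
    def input_fn():
        while True:
            yield q.get()
    preds = est.predict(input_fn)
    out = []
    for _ in range(n):
        token = next(preds)
        out.append(token)
        q.put(token)                   # feed prediction back as next input
    return out
```

With n characters, `sample_slow` triggers n graph loads while `sample_fast` triggers one; the generator-fed pattern is what I would hope the Estimator framework supports for this kind of feedback loop.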