My take on this is that it is probably not a true cipher (defined here as a letter-by-letter substitution). If it were, then with enough examples the letter substitutions could be deduced. For instance, some of the code words are obviously people, and since many Italian names end in the letter "a", the last number in these code words would probably stand for that letter.
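To make the name-ending idea concrete, here is a minimal sketch in Python. It assumes each code word has already been split into a list of numbers, one per letter; the sample data is invented for illustration and is not taken from any real message.

```python
from collections import Counter


def final_symbol_counts(code_words):
    """Count how often each number appears as the last symbol of a code word."""
    return Counter(word[-1] for word in code_words if word)


# Hypothetical code words believed to be names (invented data).
suspected_names = [
    [15, 4, 19, 4],
    [12, 8, 21, 4],
    [9, 4, 11, 7],
    [16, 20, 4],
]

counts = final_symbol_counts(suspected_names)
most_common_final, occurrences = counts.most_common(1)[0]
print(most_common_final, occurrences)
```

If one number dominates the final position across many suspected names, it becomes a reasonable first guess for "a". With only a handful of samples, as here, the result is suggestive at best.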
If it is a whole-word substitution code, then there is much less chance of cracking it. It would still be possible, but one would need a great deal of external contextual information. If most of the code words refer to people, then the messages could be correlated in time with any known activities of other mob figures. However, for that to be effective, one would need an extensive collection of these messages, correctly dated. If this is not available, correlation with outside context would be nearly impossible.
A related consideration is this: How did his confederates know what the codes stood for? There are really only two possibilities: a "codebook" was distributed initially with word-number correspondences (and perhaps updated from time to time) or the code is generated algorithmically somehow, and both he and the recipients of his messages know the algorithm.
In the first case, the code is very nearly unbreakable without the codebook. This method was standard in espionage through WWII, and its main weakness is the vulnerability of the codebook to interception. There are two types of codebook: a single-use codebook, where each page is discarded after use, and a reusable codebook. The single-use kind is more secure, since even if the code is broken for one page, the next message will still be unreadable. However, the codebooks need to be replaced from time to time, which means personal contact or a drop-off, both of which are vulnerable to discovery. The reusable codebook does not have this trouble, but once the code is broken, all messages become readable until a new codebook can be issued.
In the second case, the code is inherently breakable given enough samples, because there must be some algorithmic method that generates the code from the letters of the word being encoded. However, if the algorithm is very complex, the number of samples needed would be very high. The advantage of this method is that there is no vulnerable codebook, but one must take care not to let too many messages be intercepted. (As an interesting side-note, the famous Enigma machine used in WWII was, loosely speaking, an attempt to combine the advantages of both methods. Essentially, it was a machine whose letter substitution changed algorithmically with every keystroke, so the code was effectively different for each message. Operators still had to be issued key sheets listing the machine settings, but not a codebook of word substitutions. Alan Turing and company at Bletchley Park, building on earlier Polish work, eventually cracked it by working out how the machine behaved and finding the settings in use, which let them read the traffic.)
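As an illustration of why an algorithmic code gives way once a few samples are understood, here is a toy sketch in Python. The scheme (alphabet position plus a fixed secret offset) is purely hypothetical and much simpler than anything one should assume was actually used; the function names and sample words are invented.

```python
import string

ALPHABET = string.ascii_lowercase


def encode(word, offset):
    """Toy scheme: each letter becomes its alphabet position plus a fixed offset."""
    return [ALPHABET.index(ch) + 1 + offset for ch in word.lower()]


def recover_offset(known_word, known_code):
    """Deduce the offset from one word whose meaning is known from outside context."""
    if len(known_word) != len(known_code):
        raise ValueError("word and code differ in length")
    offsets = {
        num - (ALPHABET.index(ch) + 1)
        for ch, num in zip(known_word.lower(), known_code)
    }
    if len(offsets) != 1:
        raise ValueError("not a fixed-offset code")
    return offsets.pop()


def decode(code, offset):
    """Invert the toy scheme; fails loudly on numbers outside the alphabet."""
    letters = []
    for num in code:
        index = num - offset - 1
        if not 0 <= index < len(ALPHABET):
            raise ValueError(f"{num} does not map to a letter")
        letters.append(ALPHABET[index])
    return "".join(letters)


SECRET_OFFSET = 3  # known only to sender and recipients in this toy setup
intercepted = encode("maria", SECRET_OFFSET)
guessed = recover_offset("maria", intercepted)
print(decode(encode("luca", SECRET_OFFSET), guessed))
```

A single correctly guessed word is enough to break this toy scheme. A more complex algorithm would need more samples, but the principle is the same: the rule itself is the secret, and every intercepted message leaks a little of it.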
One way to determine which type of code he was using is to look at his behavior after he was in custody. Clearly, he would not be allowed to keep a codebook on his person, and we can assume that no such book was found (otherwise, we would know already how to read his messages). If, after he was in custody, he was found trying to send a coded message, this would prove that he was NOT using a single-use codebook, as in that case he would not know how to encode his message. That would leave only two possibilities: either he was using algorithmic encoding, or he had a standard reusable codebook, which he had memorized after years of use. The second possibility is very unlikely, since the codebook must be small enough to memorize, but at the same time, it must contain codes for a fixed list of commonly used words, such as names of other mob members, common activities, etc. It is hard to see how a list short enough to memorize would be comprehensive enough to be useful. But perhaps he has a very good memory!
On the other hand, if after he was in custody, he made NO attempt to send any coded messages, then there would be (weak) evidence that he was using an extensive codebook. In this case, the next thing to look for is whether the code referring to the same person ever changed. If two different messages can be proven (through external context) to refer to the same person, and yet the code numbers are different in the messages, then the code is periodically changing, and so he is either changing the algorithm or issuing new codebooks.
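The check for a changing code can be sketched as follows, in Python. It assumes someone has already used outside context to label which person each code word refers to; the labels and numbers below are invented for illustration.

```python
from collections import defaultdict


def codes_by_referent(observations):
    """Group the distinct code words seen for each externally identified person."""
    seen = defaultdict(set)
    for referent, code in observations:
        seen[referent].add(tuple(code))
    return seen


def referents_with_changing_codes(observations):
    """Return the people who appear under more than one code word."""
    grouped = codes_by_referent(observations)
    return sorted(name for name, codes in grouped.items() if len(codes) > 1)


# Hypothetical observations: (person identified from context, code word).
observations = [
    ("person A", [5, 12, 7]),
    ("person A", [5, 12, 7]),
    ("person B", [9, 3]),
    ("person B", [14, 8]),
]

print(referents_with_changing_codes(observations))
```

Any person who turns up in the result is evidence that the code changes over time. An empty result proves little unless the messages span a long period.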
Which case is the truth depends on what he is observed to do, which at this point we do not know.
One final possibility occurs to me: It is always possible that the numerical code is a misdirection. Perhaps the real content of the message is hidden in the plain text portion. Perhaps merely including a PS is a signal, perhaps the word "goodbye" indicates that some job is accomplished, or perhaps Papa stands for a specific place or time, etc. There are many, many possibilities with this kind of word-substitution code, and it is very nearly impossible to crack. It does, however, suffer very much from having a fixed and very limited vocabulary of encodable concepts. If he doesn't have many different types of things to say, or many different instructions to issue, then perhaps he could use such a code.