Summary: I prefer to define Functional Programming as making a distinction between pure and impure code. With this definition, you can program functionally in any language. What differentiates the functional languages is how much help they give you to make the distinction.
There are a lot of conflicting definitions of Functional Programming out there. I'd like to share mine, which serves me well. It explains why Haskell is more functional than Scheme, and also how you can program functionally in a non-functional language like Java.
Functional programming means programming with a distinction between pure code and impure code. Pure code has no side effects. It's referentially transparent. It means the same thing every time you run it. Impure code contains side effects, so running it twice is different from running it once.
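To make the distinction concrete, here is a minimal sketch in Python. The function names and the module-level `audit_log` list are made up for illustration; the only point is that one function depends on nothing but its arguments, while the other changes something outside itself.

```python
def total_price(prices, tax_rate):
    """Pure: the result depends only on the arguments, and nothing else changes."""
    return sum(prices) * (1 + tax_rate)


# Hypothetical shared state, used only to demonstrate a side effect.
audit_log = []


def record_sale(amount):
    """Impure: every call appends to shared state, so two calls differ from one."""
    audit_log.append(amount)
    return len(audit_log)
```

Calling `total_price([10, 20], 0.5)` any number of times gives the same answer and leaves the program untouched. Calling `record_sale(45)` twice leaves `audit_log` one entry longer than calling it once, and the second call returns a different number than the first.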
The distinction between pure code and impure code uniquely identifies functional programming and distinguishes it from other paradigms such as procedural and Object Oriented. Procedural is about modeling your solution as sequential steps. Object Oriented is about modeling your solution as communicating objects. Functional programming is about modeling your solution as pure functions.
Now, this definition is very practical. Notice that it's not about choice of language. You can write functional code in any language, just as you can code up an object system in C and say you're doing OO. The question is how much the language helps you write functional code or OO code.
On one extreme, you've got Haskell. There is no doubt that Haskell is a functional language. How does it help you write functional code? Values are immutable by default, and side effects are confined to a single type: IO. The language forces you to make the distinction between pure and impure.
On the other extreme, you've got machine code or assembly. At the lowest level, the language pushes you away from making the distinction. Every operation changes at least one location in memory. It could be a register or the top of the stack or something. But you are forced to change something. However, with superhuman discipline, you could keep the distinction in your head. You might create a little heap and keep the discipline "a procedure can only write to memory it allocates directly". And this way, you make a bit of room for some functional programming. But the language is not giving you any help.
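The same discipline can be sketched at a much higher level than assembly. Below is a rough Python illustration (the function names are hypothetical): the first function mutates freely, but only a list it allocated itself, so callers cannot tell it apart from a pure function. The second writes into memory the caller owns, which breaks the rule.

```python
def running_totals(values):
    """Follows the discipline: writes only to a list allocated in this call."""
    totals = []              # allocated here, unseen by the caller until returned
    acc = 0
    for v in values:         # the input is read, never written
        acc += v
        totals.append(acc)
    return totals


def running_totals_in_place(values):
    """Breaks the discipline: overwrites the list the caller passed in."""
    for i in range(1, len(values)):
        values[i] += values[i - 1]
    return values
```

Python does nothing to stop you from writing the second version. Keeping to the first style is entirely up to the programmer, which is the point: the language isn't helping.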
So why functional programming? Well, it turns out that knowing that running code twice will produce the same result makes the code much easier to reason about. And reasoning about code is basically our job as software engineers. What's more, the kind of reasoning you can do with functional programs can reach all the way up to the highest forms of reasoning, like math. That's where Haskell really shines. All of the category theory stuff (monads, functors, applicatives, etc.) is an expression of that: mathematical concepts that are applicable in Haskell code.
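One small example of that kind of reasoning, sketched in Python with made-up names: if a function is pure, you can replace two identical calls with one call and reuse the result, and you know without running anything that the program means the same thing. With an impure function, the same refactoring changes what the program does.

```python
def square(x):
    """Pure: same input, same output, no side effects."""
    return x * x


def doubled_square_two_calls(x):
    return square(x) + square(x)


def doubled_square_one_call(x):
    s = square(x)            # safe to reuse because square is pure
    return s + s


# Hypothetical shared state, used only to make a side effect observable.
calls = []


def noisy_square(x):
    """Impure: records every call in shared state."""
    calls.append(x)
    return x * x
```

`doubled_square_two_calls` and `doubled_square_one_call` are interchangeable for every input. Apply the same rewrite to code built on `noisy_square` and the returned number stays the same, but `calls` ends up with one entry instead of two, so the rewrite is no longer safe to make blindly.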
That's it. That's my definition. The definition is inclusive yet gets at the essence. Functional Programming is a perspective that makes code easier to understand and maintain, even when that code lives in a system too complex for you to fully comprehend. And at its most sublime and abstract levels, Functional Programming approaches mathematical reasoning.
If you'd like to get started with Functional Programming in Clojure, you could do worse than the LispCast Introduction to Clojure video course.