pytorch: zero_grad vs zero_grad() - role of parentheses?


I'm trying to train a PyTorch model as follows:


start = time.time()

for epoch in range(100):
    t_loss = 0
    for i in range(100):
        optimizer.zero_grad
        scores = my_model(sent_dict_list[i])
        scores = scores.permute(0, 2, 1)

        loss = loss_function(scores, torch.tensor(targ_list[i]).cuda())
        t_loss += loss.item()
        loss.backward()
        optimizer.step()

    print("t_loss = ", t_loss)



I find that when I call "optimizer.zero_grad" my loss decreases at the end of every epoch, whereas when I call "optimizer.zero_grad()" with the parentheses it stays almost exactly the same. I don't know what difference the parentheses make and was hoping someone could explain it to me.





optimizer.zero_grad is just a reference to the function; you call the function with parentheses
– xssChauhan
Jun 28 at 12:10




1 Answer



I assume you're new to Python; the '()' simply means a function call.
Consider this example:


>>> def foo():
...     print("function")
...
>>> foo
<function __main__.foo>
>>> foo()
function



Remember, functions are objects in Python; you can even store them in a list like this:


>>> [foo, foo, foo]
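
The same applies to methods on your optimizer: writing the name alone just gives you the bound method object, while adding the parentheses actually runs it. A minimal sketch (the exact repr depends on which optimizer you use):


>>> optimizer.zero_grad
<bound method Optimizer.zero_grad of SGD (...)>
>>> optimizer.zero_grad()   # only this call actually clears the gradients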



Returning to your question: you have to call the function, i.e. optimizer.zero_grad(), otherwise the gradients are never actually cleared.
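
Concretely, the line "optimizer.zero_grad" in your loop evaluates the method object and discards it, so the gradients from previous iterations keep accumulating. A sketch of the corrected inner loop, reusing the names from your question:


for epoch in range(100):
    t_loss = 0
    for i in range(100):
        optimizer.zero_grad()   # note the (): this call resets the accumulated gradients
        scores = my_model(sent_dict_list[i])
        scores = scores.permute(0, 2, 1)

        loss = loss_function(scores, torch.tensor(targ_list[i]).cuda())
        t_loss += loss.item()
        loss.backward()
        optimizer.step()

    print("t_loss = ", t_loss)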





